Copyright © 2005-06 Python Software Foundation
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
for loops, if/then/else, …
Python (version 2.3 or higher)
Cygwin if you're on Windows
Subversion
Software Carpentry for details
Python
Code fragments
Commands
«Regular expressions»
"Strings"
<tags/> and attributes
URLs
Program source
$ Shell commands
their output
and error messages
Exercise 2.1:
What is the largest software project you have ever worked on? How well did it meet its original objectives? What is the most important thing you learned from it?
Exercise 2.2:
Write a point-form list of the programming tools you use on a regular basis. When and how did you learn each one? How proficient do you think you are with each? Compared to whom?
Exercise 2.3:
Suppose you have been given one week to write a program to translate old-style configuration files to a new syntax. Write a point-form description of how you would go about it.
Exercise 2.4:
Rewrite the following fragment of code to make it more readable. Don't worry about the fact that you don't know the language it's written in; feel free to use any functions or language features you're familiar with from other languages.
i = open('oldconfig.cnf', 'r');
ll = i.readlines();
for j in 0..len(ll) {
if len(j) > 0 {
if not defined(r) r = new list;
r.append(j);
}
}
sort(r);
print 'longest line is', r[0];
Exercise 2.5:
What are the errors in the function shown below? Don't worry about the lack of variable declarations: this language doesn't need them. Note that, like C and Java, this language uses 0 as the first index for lists.
# Calculate a running sum of a list of numbers.
# If the input values are [1, 2, 3], the final values are [1, 3, 6].
def running_sum(values) {
i = 1;
while (i < len(values)) {
values[i] = values[i] + values[i-1];
}
}
Exercise 2.6:
A sub-contractor in Euphoristan has just written a function that takes two lists of phone numbers (represented as strings), and returns all those in the first list that are not in the second. You only have a few minutes to test it before she goes off-line for the weekend; what are the first half-dozen test cases you would try?
ls, cp, and wc do
![[A Shell in Action]](./img/shell01/shell_screenshot.png)
Figure 3.1: A Shell in Action
sh, is an ancestor of many of them
bash (the Bourne Again Shell) in this course
Cygwin)
![[Operating System]](./img/shell01/operating_system.png)
Figure 3.2: Operating System
notes.txt or home.html
.txt is associated with an editor, and .html with a web browser
![[A Directory Tree]](./img/shell01/directory_tree.png)
Figure 3.3: A Directory Tree
/
C:\home\gvwilson\notes.txt is different from J:\home\gvwilson\notes.txt
C:\home\gvwilson as c:/home/gvwilson
/cygdrive/c/home/gvwilson
":" a special meaning, so Cygwin needed a way to write paths without it…
"/"
/home/hpotter is Harry Potter's home directory
/courses/swc/web/lec/shell.html is this file
/courses/swc, the relative path to this file is web/lec/shell.html
"." (pronounced “dot”) is the current directory
".." (pronounced “dot dot”) is the directory one level up
/courses/swc/data, .. is /courses/swc
/courses/swc/data/elements, .. is /courses/swc/data
![[Parent Directories]](./img/shell01/parent_directory.png)
Figure 3.4: Parent Directories
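The rules for "." and ".." can be checked without touching a real file system. The following is a minimal sketch in Python, using the standard `posixpath` module so the result does not depend on the machine it runs on; the paths are the ones used in the notes.

```python
import posixpath

# ".." from /courses/swc/data is the directory one level up.
parent = posixpath.normpath(posixpath.join("/courses/swc/data", ".."))

# "." is the current directory, so it drops out when the path is normalized.
same = posixpath.normpath(posixpath.join("/courses/swc/data", "."))

# Relative path to this file when the current directory is /courses/swc.
relative = posixpath.relpath("/courses/swc/web/lec/shell.html", "/courses/swc")

print(parent)    # /courses/swc
print(same)      # /courses/swc/data
print(relative)  # web/lec/shell.html
```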
pwd (short for “print working directory”) to find out where you are
$ pwd
/home/hpotter/swc
ls (for “listing”) to see what's in the current directory
$ ls
LICENSE.txt conf data docs index.swc license.swc print.css swc.css tests
Makefile config.mk depend.mk img lec press sites swc.dtd util
data directory, type ls data
$ ls data
bio elements haiku.txt morse.txt pdb solarsystem.txt
cd data to “go into” data
data
ls on its own
cd .. to go back to where you started
$ cd data
$ pwd
/home/hpotter/swc/data
$ ls
bio elements haiku.txt morse.txt pdb solarsystem.txt
$ cd ..
$ pwd
/home/hpotter/swc
ls, the OS:
![[Running a Program]](./img/shell01/running_program.png)
Figure 3.5: Running a Program
ls produce more informative output by giving it some flags
"-", as in "-c" or "-l"
$ ls -F
LICENSE.txt conf/ data/ docs/ index.swc license.swc print.css swc.css tests/
Makefile config.mk depend.mk img/ lec/ press/ sites/ swc.dtd util/
.
ls doesn't show things whose names begin with .
. and .. don't always show up
$ ls -a
. .svn Makefile config.mk depend.mk img lec press sites swc.dtd util
.. LICENSE.txt conf data docs index.swc license.swc print.css swc.css tests
.svn directory is for later
$ ls -a -F
. .svn/ Makefile config.mk depend.mk img/ lec/ press/ sites/ swc.dtd util/
.. LICENSE.txt conf/ data/ docs/ index.swc license.swc print.css swc.css tests/
$ mkdir tmp
-v (“verbose”) would tell mkdir to print a confirmation message)
$ cd tmp
$ ls
earth.txt with the following contents:
Name: Earth
Period: 365.26 days
Inclination: 0.00
Eccentricity: 0.02
venus.txt is to copy earth.txt and edit it
$ cp earth.txt venus.txt
$ edit venus.txt
$ ls -t
venus.txt earth.txt
-t tells ls to list by modification time, instead of alphabetically
cat (short for “concatenate”)
$ cat venus.txt
Name: Venus
Period: 224.70 days
Inclination: 3.39
Eccentricity: 0.01
ls -l (“-l” meaning “long form”)
$ ls -l
total 2
-rwxr-xr-x 1 gvwilson None 73 Jan 4 15:58 earth.txt
-rwxr-xr-x 1 gvwilson None 73 Jan 4 15:58 venus.txt
wc (for “word count”)
$ wc earth.txt venus.txt
4 9 73 earth.txt
4 9 73 venus.txt
8 18 146 total
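What wc reports can be reproduced in a few lines. This is a rough Python sketch, not a reimplementation of wc: the helper name `wc` and the in-memory copy of earth.txt are assumptions made for illustration. With Unix line endings the text is 69 characters long; the 73 shown in the listing is consistent with Windows-style line endings, which is an inference rather than something the notes state.

```python
def wc(text):
    """Return (lines, words, characters) for a string, in the spirit of wc."""
    lines = text.count("\n")        # wc counts newline characters
    words = len(text.split())       # words are runs of non-whitespace
    return lines, words, len(text)

earth = "Name: Earth\nPeriod: 365.26 days\nInclination: 0.00\nEccentricity: 0.02\n"

unix_counts = wc(earth)
windows_counts = wc(earth.replace("\n", "\r\n"))
print(unix_counts)     # (4, 9, 69)
print(windows_counts)  # (4, 9, 73)
```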

| Name | Purpose |
|---|---|
man | Documentation for commands. |
cat | Concatenate and display text files. |
cd | Change working directory. |
clear | Clear the screen. |
cp | Copy files and directories. |
date | Display the current date and time. |
diff | Show differences between two text files. |
echo | Print arguments. |
head | Display the first few lines of a file. |
ls | List files and directories. |
mkdir | Make directories. |
more | Page through a text file. |
mv | Move (rename) files and directories. |
od | Display the bytes in a file. |
passwd | Change your password. |
pwd | Print current working directory. |
rm | Remove files. |
rmdir | Remove directories. |
sort | Sort lines. |
tail | Display the last few lines of a file. |
uniq | Remove adjacent duplicate lines. |
wc | Count lines, words, and characters in a file. |

Table 3.1: Basic Command-Line Tools
Exercise 3.1:
Suppose ls shows you this:
Makefile biography.txt data enrolment.txt programs thesis
What argument(s) will make it print the names in reverse, like this:
thesis programs enrolment.txt data biography.txt Makefile
Exercise 3.2:
What does the command cd ~ do? What about cd ~hpotter?
Exercise 3.3:
What command will show you the first 10 lines of a file? The first 25? The last 12?
Exercise 3.4:
What do the commands pushd, popd,
and dirs do? Where do their names come from?
Exercise 3.5:
How would you send the file earth.txt to the
default printer? How would you check it made it (other than
wandering over to the printer and standing there)?
Exercise 3.6:
The instructor wants you to use a hitherto unknown command for manipulating files. How would you get help on this command?
Exercise 3.7:
diff finds and displays the differences between
two text files. For example, if you modify earth.txt to
create a new file earth2.txt that contains:
Name: Earth
Period: 365.26 days
Inclination: 0.00 degrees
Eccentricity: 0.02
Satellites: 1
you can then compare the two files like this:
$ diff earth.txt earth2.txt
3c3
< Inclination: 0.00
---
> Inclination: 0.00 degrees
4a5
> Satellites: 1
(The rather cryptic header "3c3" means that line 3 of
the first file must be changed to get line 3 of the second;
"4a5" means that a line is being added after line 4 of the
original file.)
What flag(s) should you give diff to tell it to
ignore changes that just insert or delete blank lines? What if
you want to ignore changes in case (i.e., treat lowercase and
uppercase letters as the same)?
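The same comparison can be sketched in Python with the standard `difflib` module. Note that `unified_diff` prints a different layout from the "3c3" style shown in this exercise: removed lines are marked with "-" and added lines with "+". The two lists below are in-memory stand-ins for earth.txt and earth2.txt.

```python
import difflib

earth = ["Name: Earth", "Period: 365.26 days",
         "Inclination: 0.00", "Eccentricity: 0.02"]
earth2 = ["Name: Earth", "Period: 365.26 days",
          "Inclination: 0.00 degrees", "Eccentricity: 0.02",
          "Satellites: 1"]

# Compare the two versions line by line.
diff_lines = list(difflib.unified_diff(earth, earth2,
                                       fromfile="earth.txt",
                                       tofile="earth2.txt",
                                       lineterm=""))
for line in diff_lines:
    print(line)
```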
stdin and stdout are
$PATH is
-rwxr-xr-x means
* matches zero or more characters
ls bio/*.txt lists all the text files in the bio directory
$ ls bio/*.txt
bio/albus.txt bio/ginny.txt bio/harry.txt bio/hermione.txt bio/ron.txt
? matches any single character
ls jan-??.txt lists text files whose names start with “jan-” followed by two characters
ls jan-??.* does
ls can't tell whether it was invoked as ls *.txt or as ls earth.txt venus.txt
ta* does not find the tabulate command
command < input_file reads from input_file instead of from the keyboard
command > output_file writes to output_file instead of to the screen
command < input_file > output_file does both
![[Redirecting Standard Input and Output]](./img/shell02/redirection.png)
Figure 4.1: Redirecting Standard Input and Output
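The wildcard rules from this section ("*" for zero or more characters, "?" for exactly one) can be tried out with Python's standard `fnmatch` module. This is only a sketch: the file names are made up, and `fnmatch` matches plain strings rather than expanding names the way the shell does.

```python
import fnmatch

# Hypothetical directory listing, made up for this example.
names = ["jan-01.txt", "jan-1.txt", "jan-02.dat", "feb-01.txt", "notes.txt"]

all_text = fnmatch.filter(names, "*.txt")       # any name ending in .txt
jan_text = fnmatch.filter(names, "jan-??.txt")  # exactly two characters after "jan-"
jan_any = fnmatch.filter(names, "jan-??.*")     # same stem, any extension

print(all_text)  # ['jan-01.txt', 'jan-1.txt', 'feb-01.txt', 'notes.txt']
print(jan_text)  # ['jan-01.txt']
print(jan_any)   # ['jan-01.txt', 'jan-02.dat']
```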
words.len:
$ cd bio
$ wc *.txt > words.len
words.len
cat
$ cat words.len
7 66 468 albus.txt
5 46 311 ginny.txt
5 49 342 harry.txt
5 49 331 hermione.txt
6 54 364 ron.txt
28 264 1816 total
cat > junk.txt
cat reads from the keyboard
rm junk.txt to get rid of the file
rm * unless you're really, really sure that's what you want to do…
sort words >words
sort then goes and reads the empty file
words are lost
wc -w *.txt to count the words in some files, then sort -n to sort numerically
$ wc -w *.txt > words.tmp
$ sort -n words.tmp
46 ginny.txt
49 harry.txt
49 hermione.txt
54 ron.txt
66 albus.txt
264 total
$ rm words.tmp
"|"
$ wc -w *.txt | sort -n
46 ginny.txt
49 harry.txt
49 hermione.txt
54 ron.txt
66 albus.txt
264 total
![[Pipes]](./img/shell02/pipes.png)
Figure 4.2: Pipes
$ grep 'Title' spells.txt | sort | uniq -c | sort -n -r | head -10 > popular_spells.txt
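Each stage of the pipeline above does one small job. A rough Python equivalent, assuming made-up contents for spells.txt (the notes do not show the file), might look like this. The function name `popular_titles` is hypothetical, and the order of entries with equal counts may differ from what sort -n -r would give.

```python
from collections import Counter

def popular_titles(lines, limit=10):
    """Count lines containing 'Title' and return the most frequent ones."""
    matching = [line for line in lines if "Title" in line]   # grep 'Title'
    counts = Counter(matching)                                # sort | uniq -c
    return counts.most_common(limit)                          # sort -n -r | head

# Hypothetical file contents, invented for this example.
sample = [
    "Title: Lumos",
    "Caster: hpotter",
    "Title: Accio",
    "Title: Lumos",
    "Title: Lumos",
    "Title: Accio",
    "Title: Alohomora",
]

top = popular_titles(sample, limit=2)
print(top)  # [('Title: Lumos', 3), ('Title: Accio', 2)]
```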
set at the command prompt to get a listing:
$ set
BASH=/usr/bin/bash
BASH_VERSION='2.05b.0(1)-release'
COLUMNS=120
HISTFILE=/home/.bash_history
HISTFILESIZE=500
HISTSIZE=500
HOME=/home/rweasley
HOSTNAME=hogwarts
HOSTTYPE=i686
LINES=60
NUMBER_OF_PROCESSORS=1
OSTYPE=cygwin
PATH='/usr/local/bin:/usr/bin:/bin:/Python24:/home/rweasley/bin'
PWD=/home/rweasley
SHELL=/bin/bash
UID=1003
USER=rweasley
"$" in front of its name
ls $HOME is the same as ls /home/rweasley (if you're Ron Weasley)
echo command to print out a variable's value
$ echo $HOME
/cygdrive/c/home/rweasley
echo $HOME, and not just $HOME?

| Name | Typical Value | Notes |
|---|---|---|
COLUMNS | 80 | The width in characters of the current display window |
EDITOR | /bin/edit | Preferred editor |
HOME | /home/rweasley | The current user's home directory |
HOMEDRIVE | C: | The current user's home drive (Windows only) |
HOSTNAME | "ishad" | This computer's name |
HOSTTYPE | "i686" | What kind of computer this is |
LINES | 60 | The height in characters of the current display |
OS | "Windows_NT" | What operating system is running |
PATH | "/home/rweasley/bin:/usr/local/bin:/usr/bin:/bin:/Python24/" | Where to look for programs |
PWD | /home/rweasley/swc/lec | Present working directory (sometimes CWD, for current working directory) |
SHELL | /bin/bash | What shell is being run |
TEMP | /tmp | Where to store temporary files |
USER | "rweasley" | The current user's ID |

Table 4.1: Important Environment Variables
$ VILLAIN="Lord Voldemort"
$ VILLAIN="Lord Voldemort"
$ bash
$ echo $VILLAIN
$ exit
![[Setting a Variable Without Exporting It]](./img/shell02/shell_no_export.png)
Figure 4.3: Setting a Variable Without Exporting It
$ VILLAIN="Lord Voldemort"
$ export VILLAIN
$ bash
$ echo $VILLAIN
Lord Voldemort
$ exit
![[Exporting a Variable's Value]](./img/shell02/shell_with_export.png)
Figure 4.4: Exporting a Variable's Value
$ export VILLAIN="Lord Voldemort"
$ bash
$ echo $VILLAIN
Lord Voldemort
$ exit
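The difference between the sessions above is whether the variable is part of the environment handed to the child process. The sketch below imitates that in Python by starting a child interpreter with and without VILLAIN in its environment. It is an analogy rather than a description of how bash itself works, and `run_child` is a hypothetical helper.

```python
import os
import subprocess
import sys

# Program run by the child: report VILLAIN if it was passed down.
CHILD = "import os; print(os.environ.get('VILLAIN', '(not set)'))"

def run_child(environment):
    """Start a child Python process and return what it printed."""
    result = subprocess.run([sys.executable, "-c", CHILD], env=environment,
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()

# Not exported: the child's environment does not contain VILLAIN.
without_export = {k: v for k, v in os.environ.items() if k != "VILLAIN"}
# Exported: the variable is part of the environment given to the child.
with_export = dict(without_export, VILLAIN="Lord Voldemort")

print(run_child(without_export))  # (not set)
print(run_child(with_export))     # Lord Voldemort
```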
~/.bashrc
"~" is a shortcut meaning “your home directory”
# Add personal tools directory to PATH.
PATH=$HOME/bin:$PATH
# Personal settings.
export EDITOR=/local/bin/emacs
export PRINTER=gryffindor-laserwriter
# Change default behavior of commands.
alias ls="ls -F"
.bashrc files can become very complex…
ls won't show them
PATH environment variable defines the shell's search path
broom, the shell:
$PATH into components to get a list of directories
PATH is /home/rweasley/bin:/usr/local/bin:/usr/bin:/bin:/Python24
/usr/local/bin/broom and /home/rweasley/bin/broom exist
/home/rweasley/bin/broom will be run when you type broom at the command prompt
/bin, /usr/bin: core tools like ls
/usr/local/bin: optional (but common) tools, like the gcc C compiler
$HOME/bin: tools you have built for yourself
$HOME is your home directory
. (the current working directory) in your path
whatever, instead of ./whatever
Cygwin does things a little differently
/cygdrive/c/somewhere instead of Windows' C:/somewhere
C:/somewhere would clash with the colons in the PATH variable
C:/cygwin as the root of its file system
/home/rweasley is a synonym for C:/cygwin/home/rweasley
groups command will show you which ones you are in
ls -l shows this information
rwx triples
"-"
rw-rw-r-- means:
tools has permission rwx--x--x, then:
ls tools, permission is denied
tools/pfold
chmod
chmod u+x broom allows broom's owner to run it
chmod o-r notes.txt takes away the world's read permission for notes.txt
nojunk
#!/usr/bin/bash
rm -f *.junk
man rm to find out what the “-f” flag does
#!/usr/bin/bash means “run this using the Bash shell”
#!
rwxr-xr-x
./nojunk
$HOME/bin is in your search path, move it there
test
/usr/bin/test
./try

| Name | Purpose |
|---|---|
chmod | Change file and directory permissions. |
du | Print the disk space used by files and directories. |
find | Find files with names that match patterns, that are of a certain age or size, etc. |
grep | Print lines matching a pattern. |
gunzip | Uncompress a file. |
gzip | Compress a file. |
lpr | Send a file to a printer. |
lprm | Remove a print job from a printer's queue. |
lpq | Check the status of a printer's queue. |
ps | Display running processes. |
tar | Archive files. |
which | Find the path to a program. |
who | See who is logged in. |
xargs | Execute a command for each line of input. |

Table 4.2: Advanced Command-Line Tools
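The search-path rule described in this section (split PATH on colons, look in each directory in order, run the first match) can be sketched in Python. `find_program` and the set of installed programs are assumptions made for illustration. Python's standard `shutil.which` performs a real lookup against the actual file system.

```python
import posixpath

def find_program(name, path, exists):
    """Return the first directory/name on path for which exists() is true."""
    for directory in path.split(":"):           # split $PATH into components
        candidate = posixpath.join(directory, name)
        if exists(candidate):                    # first match wins
            return candidate
    return None

PATH = "/home/rweasley/bin:/usr/local/bin:/usr/bin:/bin:/Python24"

# Hypothetical set of programs that exist on this imaginary machine.
installed = {"/usr/local/bin/broom", "/home/rweasley/bin/broom", "/bin/ls"}

print(find_program("broom", PATH, installed.__contains__))
# /home/rweasley/bin/broom
print(find_program("tabulate", PATH, installed.__contains__))
# None
```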
Exercise 4.1:
-rwxr-xr-x 1 aturing cambridge 69 Jul 12 09:17 mars.txt
-rwxr-xr-x 1 ghopper usnavy 71 Jul 12 09:15 venus.txt
According to the listing of the data directory above,
who can read the file earth.txt? Who can write it (i.e.,
change its contents or delete it)? When was earth.txt
last changed? What command would you run to allow everyone to
edit or delete the file?
Exercise 4.2:
Suppose you want to remove all files whose names (not including
their extensions) are of length 3, start with the letter a, and
have .txt as extension. What command would you use? For
example, if the directory contains three files a.txt,
abc.txt, and abcd.txt, the command should remove
abc.txt, but not the other two files.
Exercise 4.3:
You're worried your data files can be read by your nemesis, Dr. Evil. How would you check whether or not he can, and if necessary change permissions so only you can read or write the files?
Exercise 4.4:
What's the difference between the commands cd HOME
and cd $HOME?
Exercise 4.5:
Suppose you want to list the names of all the text files in the
data directory that contain the word "carpentry". What
command or commands could you use?
Exercise 4.6:
Suppose you have written a program called analyze. What
command or commands could you use to display the first ten lines of
its output? What would you use to display lines 50-100? To send
lines 50-100 to a file called tmp.txt?
Exercise 4.7:
The command ls data > tmp.txt writes a listing of
the data directory's contents into tmp.txt. Anything
that was in the file before the command was run is overwritten. What
command could you use to append the listing to tmp.txt
instead?
Exercise 4.8:
What command(s) would you use to find out how many
subdirectories there are in the lectures directory?
Exercise 4.9:
What does rm *.ch do? What about rm
*.[ch]?
Exercise 4.10:
What command(s) could you use to find out how many instances of
a program are running on your computer at once? For example, if you
are on Windows, what would you do to find out how many instances of
svchost.exe are running? On Unix, what would you do to
find out how many instances of bash are running?
Exercise 4.11:
A colleague asks for your data files. How would you archive them to send as one file? How could you compress them?
Exercise 4.12:
You have changed a text file on your home PC, and mailed it to the university terminal. What steps can you take to see what changes you may have made, compared with a master copy in your home directory?
Exercise 4.13:
How would you change your password?
Exercise 4.14:
grep is one of the more useful tools in the
toolbox. It finds lines in files that match a pattern and
prints them out. For example, assume the files
earth.txt and venus.txt contain lines like
this:
Name: Earth
Period: 365.26 days
Inclination: 0.00
Eccentricity: 0.02
grep can extract lines containing the text
"Period" from all the files:
$ grep Period *.txt
earth.txt:Period: 365.26 days
venus.txt:Period: 224.70 days
Search strings can use regular
expressions, which will be discussed in the Regular Expressions lecture. grep takes many
options as well; for example, grep -c /bin/bash
/etc/passwd reports how many lines in /etc/passwd
(the Unix password file) contain the string
/bin/bash, which in turn tells me how many users are
using bash as their shell.
Suppose all you wanted was a list of the files that
contained lines matching a pattern, rather than the matches
themselves—what flag or flags would you give to
grep? What if you wanted the line numbers of
matching lines?
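What grep does in the example above can be sketched in a few lines of Python. This version only handles plain substrings, not regular expressions, and it reads from an in-memory dictionary standing in for earth.txt and venus.txt. The function name `grep` is just a label for the sketch.

```python
def grep(pattern, files):
    """Return 'name:line' for every line that contains pattern."""
    matches = []
    for name in sorted(files):
        for line in files[name].splitlines():
            if pattern in line:
                matches.append(name + ":" + line)
    return matches

# In-memory stand-ins for the two files used in this exercise.
files = {
    "earth.txt": "Name: Earth\nPeriod: 365.26 days\nInclination: 0.00\nEccentricity: 0.02\n",
    "venus.txt": "Name: Venus\nPeriod: 224.70 days\nInclination: 3.39\nEccentricity: 0.01\n",
}

for match in grep("Period", files):
    print(match)
```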
Exercise 4.15:
Suppose you wanted ls to sort its output by
filename extension, i.e., to list all .cmd files before
all .exe files, and all .exe's before all
.txt files. What command or commands would you
use?
Exercise 4.16:
What does the alias command do? When would
you use it?
print statements
![[Managing Multi-Author Collaboration]](./img/version/multi_author_collab.png)
Figure 5.1: Managing Multi-Author Collaboration
![[Version Control as a Time Machine]](./img/version/time_machine.png)
Figure 5.2: Version Control as a Time Machine
Perforce is excellent
CVS and Subversion are:
CVS has been around since the 1980s
Subversion developed from 2000 onward as a workalike replacement
solarsystem project repository
svn update to synchronize his working copy with the repository
jupiter directory and creates moons.txt
Name Orbital Radius Orbital Period Mass Radius
Io 421.6 1.769138 893.2 1821.6
Europa 670.9 3.551181 480.0 1560.8
Ganymede 1070.4 7.154553 1481.9 2631.2
Callisto 1882.7 16.689018 1075.9 2410.3
svn add moons.txt to bring it to Subversion's notice
svn commit to save his changes in the repository
svn update on her working copy
Subversion sends her Ron's changes
![[The Basic Edit/Update Cycle]](./img/version/edit_update_cycle.png)
Figure 5.3: The Basic Edit/Update Cycle
RapidSVN is a GUI that runs on Windows, Linux, and Mac
![[RapidSVN]](./img/version/rapidsvn.png)
Figure 5.4: RapidSVN
TortoiseSVN is a Windows shell extension
![[TortoiseSVN]](./img/version/tortoisesvn.png)
Figure 5.5: TortoiseSVN
Subversion) do this
moons.txt and commits his changes to create version 152
Name Orbital Radius Orbital Period Mass Radius
Io 421.6 1.769138 893.2 1821.6
Europa 670.9 3.551181 480.0 1560.8
Ganymede 1070.4 7.154553 1481.9 2631.2
Callisto 1882.7 16.689018 1075.9 2410.3
Amalthea 181.4 0.498179 0.075 131 x 73 x 67
Himalia 11460 250.5662 0.095 85
Elara 11740 259.6528 0.008 40
moons.txt
Name Orbital Radius Orbital Period Mass Radius
(10**3 km) (days) (10**20 kg) (km)
Io 421.6 1.769138 893.2 1821.6
Europa 670.9 3.551181 480.0 1560.8
Ganymede 1070.4 7.154553 1481.9 2631.2
Callisto 1882.7 16.689018 1075.9 2410.3
Amalthea 181.4 0.498179 0.075 131
Himalia 11460 250.5662 0.095 85
Elara 11740 259.6528 0.008 40
Pasiphae 23620 743.6 0.003 18
Sinope 23940 758.9 0.0008 14
Lysithea 11720 259.22 0.0008 12
Subversion tells her there's a conflict
![[Merging Conflicts]](./img/version/conflict_merge.png)
Figure 5.6: Merging Conflicts
Subversion puts Hermione's changes and Ron's in moons.txt
Name Orbital Radius Orbital Period Mass Radius
(10**3 km) (days) (10**20 kg) (km)
Io 421.6 1.769138 893.2 1821.6
Europa 670.9 3.551181 480.0 1560.8
Ganymede 1070.4 7.154553 1481.9 2631.2
Callisto 1882.7 16.689018 1075.9 2410.3
<<<<<<< .mine
Amalthea 181.4 0.498179 0.075 131
Himalia 11460 250.5662 0.095 85
Elara 11740 259.6528 0.008 40
Pasiphae 23620 743.6 0.003 18
Sinope 23940 758.9 0.0008 14
Lysithea 11720 259.22 0.0008 12
=======
Amalthea 181.4 0.498179 0.075 131 x 73 x 67
Himalia 11460 250.5662 0.095 85
Elara 11740 259.6528 0.008 40
>>>>>>> .r152
<<<<<<< shows the start of the section from the first file
======= divides sections
>>>>>>> shows the end of the section from the second file
Subversion also creates:
moons.txt.mine: contains Hermione's changes
moons.txt.r151: the file before either set of changes
moons.txt.r152: the most recent version of the file in the repository
svn revert moons.txt to throw away her changes
moons.txt
moons.txt to remove the conflict markers
svn resolved moons.txt to let Subversion know she's done
svn commit to commit her changes (creating version 153 of the repository)
svn diff shows him which files he has changed, and what those changes are
svn revert to discard his work
moons.txt
svn log shows recent history
svn merge -r 157:156 moons.txt will do the trick
-r flag specifies the revisions involved
![[Rolling Back]](./img/version/rollback.png)
Figure 5.7: Rolling Back
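The conflict markers described in this section are regular enough to pick apart mechanically. The following Python sketch separates the two sides of the first conflict in a file. `split_conflict` is a hypothetical helper, the sample is an abbreviated version of the moons.txt conflict, and resolving a real conflict still needs a person to decide what the file should say.

```python
def split_conflict(lines):
    """Return (mine, theirs) line lists from the first conflict block found."""
    mine, theirs = [], []
    state = "outside"
    for line in lines:
        if line.startswith("<<<<<<<") and state == "outside":
            state = "mine"                  # start of the local changes
        elif line.startswith("=======") and state == "mine":
            state = "theirs"                # divider between the two sides
        elif line.startswith(">>>>>>>") and state == "theirs":
            break                           # end of the repository's version
        elif state == "mine":
            mine.append(line)
        elif state == "theirs":
            theirs.append(line)
    return mine, theirs

# Abbreviated version of the conflicted moons.txt.
sample = [
    "Callisto 1882.7 16.689018 1075.9 2410.3",
    "<<<<<<< .mine",
    "Amalthea 181.4 0.498179 0.075 131",
    "Pasiphae 23620 743.6 0.003 18",
    "=======",
    "Amalthea 181.4 0.498179 0.075 131 x 73 x 67",
    ">>>>>>> .r152",
]

mine, theirs = split_conflict(sample)
print(mine)
print(theirs)
```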
/svn/rotor)
cd /svn
svnadmin create rotor
svn checkout file:///svn/rotor
svn checkout http://www.hogwarts.edu/svn/rotor
svn checkout once, to initialize your working copy
svn update in that directory
svn co http://www.hogwarts.edu/svn/rotor/engine/dynamics

| Name | Purpose | Example |
|---|---|---|
svn add | Add files and/or directories to version control. | svn add newfile.c newdir |
svn checkout | Get a fresh working copy of a repository. | svn checkout https://your.host.name/rotor/repo rotorproject |
svn commit | Send changes from working copy to repository (inverse of update). | svn commit -m "Comment on the changes" |
svn delete | Delete files and/or directories from version control. | svn delete oldfile.c |
svn help | Get help (in general, or for a particular command). | svn help update |
svn log | Show history of recent changes. | svn log --verbose *.c |
svn merge | Merge two different versions of a file into one. | svn merge -r 18:16 spin.c |
svn mkdir | Create a new directory and put it under version control. | svn mkdir newmodule |
svn rename | Rename a file or directory, keeping track of history. | svn rename temp.txt release_notes.txt |
svn revert | Undo changes to working copy (i.e., resynchronize with repository). | svn revert spin.h |
svn status | Show the status of files and directories in the working copy. | svn status |
svn update | Bring changes from repository into working copy (inverse of commit). | svn update |

Table 5.1: Common Subversion Commands
svn status compares your working copy with the repository
$ svn status
M jupiter/moons.txt
C readme.txt
jupiter/moons.txt has been modified
readme.txt has conflicts
svn update prints one line for each file or directory it does something to
$ svn update
A saturn/moons.txt
U mars/mars.txt
saturn/moons.txt has been added
mars/mars.txt has been updated (i.e., someone else modified it)
Exercise 5.1:
Follow the instructions given to you by your instructor to
check out a copy of the Subversion repository you'll be using in
this course. Unless otherwise noted, the exercises below
assume that you have done this, and that your working copy is in
a directory called course. You will submit all of your
exercises in this course by checking files into your
repository.
Exercise 5.2:
Create a file course/ex01/bio.txt (where
course is the root of your working copy of your
Subversion repository), and write a short biography of yourself
(100 words or so) of the kind used in academic journals,
conference proceedings, etc. Commit this file to your
repository. Remember to provide a meaningful comment when
committing the file!
Exercise 5.3:
What's the difference between mv and svn
mv? Put the answer in a file called
course/ex01/mv.txt and commit your changes.
Once you have committed your changes, type svn
log in your course directory. If you didn't know
what you'd just done, would you be able to figure it out from
the log messages? If not, why not?
Exercise 5.4:
In this exercise, you'll simulate the actions of two people editing a single file. To do that, you'll need to check out a second copy of your repository. One way to do this is to use a separate computer (e.g., your laptop, your home computer, or a machine in the lab). Another is to make a temporary directory, and check out a second copy of your repository there. Please make sure that the second copy isn't inside the first, or vice versa—Subversion will become very confused.
Let's call the two working copies Blue and Green. Do the following:
a) Create Blue/ex01/planets.txt, and add the
following lines:
Mercury
Venus
Earth
Mars
Jupiter
Saturn
Commit the file.
b) Update the Green working copy. (You should get a copy of
planets.txt.)
c) Change Blue/ex01/planets.txt so that it reads:
1. Mercury
2. Venus
3. Earth
4. Mars
5. Jupiter
6. Saturn
Commit the changes.
d) Edit Green/ex01/planets.txt so that its contents
are as shown below. Do not do svn update
before editing this file, as that will spoil the
exercise.
Mercury 0
Venus 0
Earth 1
Mars 2
Jupiter 16 (and counting)
Saturn 14 (and counting)
e) Now, in Green, do svn update. Subversion
should tell you that there are conflicts in planets.txt.
Resolve the conflicts so that the file contains:
1. Mercury 0
2. Venus 0
3. Earth 1
4. Mars 2
5. Jupiter 16
6. Saturn 14
Commit the changes.
f) Update the Blue working copy, and check that
planets.txt now has the same content as it has in the
Green working copy.
Exercise 5.5:
Add another line or two to course/ex01/bio.txt and
commit those changes. Then, use svn merge to restore
the original contents of your biography
(course/ex01/bio.txt), and commit the result. (When you
are done, bio.txt should look the way it did at the end
of Exercise 5.2.) Note: the purpose
of this exercise is to teach you how to go back in time to get
old versions of files—while it would be simpler in this
case just to edit bio.txt, you can't (reliably) do that
when you've made larger changes, to multiple files, over a
longer period of time.
Exercise 5.6:
Subversion allows users to set properties on files and
directories using svn propset, and to inspect their
values using svn propget. Describe three properties
you might want to change on a file or directory, and how you
might use them in your current project.
gcc -c -Wall -ansi -I/pkg/chempak/include dat2csv.c once is bad enough
Make
Make is freely available for every major platform, and very well documented
Make's syntax
Time: 1.2271 Concentration: 0.0050 Yield: 11.41 Time: 2.5094 Concentration: 0.0055 Yield: 11.20 Time: 3.7440 Concentration: 0.0060 Yield: 10.90
dat2csv
hello.mk:
hydroxyl_422.csv : hydroxyl_422.dat
	dat2csv hydroxyl_422.dat > hydroxyl_422.csv
make -f hello.mk
make -f hello.mk again
hydroxyl_422.csv is newer than hydroxyl_422.dat, Make does not run the command again
![[Structure of a Make Rule]](./img/build/rule_structure.png)
Figure 6.1: Structure of a Make Rule
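The decision Make takes for a rule like the one in hello.mk boils down to comparing modification times. The sketch below shows that check in Python under simplifying assumptions: `needs_update` is a hypothetical helper, the files live in a temporary directory, and timestamps are set by hand so the outcome does not depend on the clock. Real Make also handles chains of rules, phony targets, and more.

```python
import os
import tempfile

def needs_update(target, prerequisites):
    """Return True if target is missing or older than any prerequisite."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_time for p in prerequisites)

with tempfile.TemporaryDirectory() as workdir:
    dat = os.path.join(workdir, "hydroxyl_422.dat")
    csv = os.path.join(workdir, "hydroxyl_422.csv")
    with open(dat, "w") as handle:
        handle.write("Time: 1.2271\n")
    missing = needs_update(csv, [dat])     # no .csv yet, so it must be built
    with open(csv, "w") as handle:
        handle.write("1.2271\n")
    os.utime(dat, (1000, 1000))            # pretend the .dat is old
    os.utime(csv, (2000, 2000))            # and the .csv is newer
    fresh = needs_update(csv, [dat])       # nothing to do
    os.utime(dat, (3000, 3000))            # now "edit" the .dat
    stale = needs_update(csv, [dat])       # the .csv is out of date

print(missing, fresh, stale)  # True False True
```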
hydroxyl_422.csv is the target of the rule
hydroxyl_422.dat is its prerequisite
Make runs them on your behalf, just as the shell runs the command you type
hydroxyl_422.csv : hydroxyl_422.dat
	dat2csv hydroxyl_422.dat > hydroxyl_422.csv
methyl_422.csv : methyl_422.dat
	dat2csv methyl_422.dat > methyl_422.csv
make -f double.mk, only hydroxyl_422.csv is compiled
Make will update
make -f double.mk methyl_422.csv to build methyl_422.csv
Make separately for each target would hardly count as “automation”
all : hydroxyl_422.csv methyl_422.csv
hydroxyl_422.csv : hydroxyl_422.dat
	dat2csv hydroxyl_422.dat > hydroxyl_422.csv
methyl_422.csv : methyl_422.dat
	dat2csv methyl_422.dat > methyl_422.csv
make -f phony.mk all now creates both .csv files
all depends on hydroxyl_422.csv and methyl_422.csv
.dat file
Make's built-in processing cycle:
Make can execute actions in any order it wants to, as long as it doesn't violate dependency ordering
hydroxyl_422.csv or methyl_422.csv first
all
make with no arguments, it automatically looks for a file called Makefile
make only updates the first one it finds
"all": recompile everything
"clean": delete all temporary files, and everything produced by compilation
"install": copy files to system directories
make configure
make
make test
make install
Make defines automatic variables to represent parts of rules

| Variable | Purpose |
|---|---|
"$@" | The rule's target |
"$<" | The rule's first prerequisite |
"$?" | All of the rule's out-of-date prerequisites |
"$^" | All prerequisites |

Table 6.1: Automatic Variables in Make
all : hydroxyl_422.csv methyl_422.csv
hydroxyl_422.csv : hydroxyl_422.dat
	@dat2csv $< > $@
methyl_422.csv : methyl_422.dat
	@dat2csv $< > $@
clean :
	@rm -f *.csv
Make echoes actions before executing them
"@" at the start of the action line prevents this
clean to tidy up generated files
rm -f instead of just rm?
all : hydroxyl_422.csv methyl_422.csv
%.csv : %.dat
	@dat2csv $< > $@
clean :
	@rm -f *.csv
"%" represents the stem of the file's name in the target and prerequisites
summarize to combine data from hydroxyl_422.csv and hydroxyl_480.csv
hydroxyl_all.csv
all : hydroxyl_all.csv methyl_all.csv
%_all.csv : %_422.csv %_480.csv
	summarize $^ > $@
%.csv : %.dat
	dat2csv $< > $@
clean :
	@rm -f *.csv
%_all.csv takes precedence over the rule for %.csv
Make uses the most specific rule available
$ make -f depend.mk
dat2csv hydroxyl_422.dat > hydroxyl_422.csv
dat2csv hydroxyl_480.dat > hydroxyl_480.csv
summarize hydroxyl_422.csv hydroxyl_480.csv > hydroxyl_all.csv
dat2csv methyl_422.dat > methyl_422.csv
dat2csv methyl_480.dat > methyl_480.csv
summarize methyl_422.csv methyl_480.csv > methyl_all.csv
rm hydroxyl_480.csv methyl_422.csv hydroxyl_422.csv methyl_480.csv
Make automatically removes intermediate files created by pattern rules when it's done
Make is a little programming language
INPUT_DIR = /lab/gamma2100
OUTPUT_DIR = /tmp
all : ${OUTPUT_DIR}/hydroxyl_all.csv ${OUTPUT_DIR}/methyl_all.csv
${OUTPUT_DIR}/%_all.csv : ${OUTPUT_DIR}/%_422.csv ${OUTPUT_DIR}/%_480.csv
	@summarize $^ > $@
${OUTPUT_DIR}/%.csv : ${INPUT_DIR}/%.dat
	@dat2csv $< > $@
clean :
	@rm -f *.csv
"$" in front of the name and parentheses or braces around it
$(XYZ) or ${XYZ}
Make interprets "$XYZ" as the value of "X", followed by the characters "YZ"
Make when invoking it
name=value pairs on the command line
make -f macro.mk sets INPUT_DIR to /lab/gamma2100
make INPUT_DIR=/newlab -f macro.mk uses /newlab
Make also looks at environment variables
${HOME} in a Makefile without having defined it
VAL = original
echo :
	@echo "VAL is" ${VAL}
$ make -f env.mk echo
VAL is original
$ make VAL=changed -f env.mk echo
VAL is changed
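The quoting rule for macros (parentheses or braces around multi-character names, otherwise only the first character counts) is easy to get wrong, so here is a small Python sketch of it. This is an illustration only: `expand` is a hypothetical function, the value given to "X" is made up, and real Make also handles "$$", automatic variables such as "$@", and nested references, none of which are covered here.

```python
import re

# $(NAME), ${NAME}, or a single character after "$".
REFERENCE = re.compile(r"\$(?:\((\w+)\)|\{(\w+)\}|(\w))")

def expand(text, variables):
    """Expand $(NAME), ${NAME}, and single-character $X references."""
    def lookup(match):
        name = match.group(1) or match.group(2) or match.group(3)
        return variables.get(name, "")     # undefined names expand to nothing
    return REFERENCE.sub(lookup, text)

variables = {"XYZ": "/lab/gamma2100", "X": "ex"}

print(expand("${XYZ}/data", variables))   # /lab/gamma2100/data
print(expand("$(XYZ)/data", variables))   # /lab/gamma2100/data
print(expand("$XYZ/data", variables))     # exYZ/data
```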
addprefix and addsuffix to build a list of filenames
hydroxyl into /tmp/hydroxyl_all.csv and methyl into /tmp/methyl_all.csv
INPUT_DIR = /lab/gamma2100
OUTPUT_DIR = /tmp
CHEMICALS = hydroxyl methyl
SUMMARIES = $(addprefix ${OUTPUT_DIR}/,$(addsuffix _all.csv,${CHEMICALS}))
all : ${SUMMARIES}
${OUTPUT_DIR}/%_all.csv : ${OUTPUT_DIR}/%_422.csv ${OUTPUT_DIR}/%_480.csv
	@summarize $^ > $@
${OUTPUT_DIR}/%.csv : ${INPUT_DIR}/%.dat
	@dat2csv $< > $@
clean :
	@rm -f *.csv
| Function | Purpose |
|---|---|
| $(addprefix prefix,filenames) | Add a prefix to each filename in a list |
| $(addsuffix suffix,filenames) | Add a suffix to each filename in a list |
| $(dir filenames) | Extract the directory name portion of each filename in a list |
| $(filter pattern,text) | Keep words in text that match pattern |
| $(filter-out pattern,text) | Keep words in text that don't match pattern |
| $(patsubst pattern,replacement,text) | Replace everything that matches pattern in text |
| $(sort text) | Sort the words in text, removing duplicates |
| $(strip text) | Remove leading and trailing whitespace from text |
| $(subst from,to,text) | Replace from with to in text |
| $(wildcard pattern) | Create a list of filenames that match a pattern |
| Table 6.2: Commonly-Used Functions | |
echo to print things as Make executes
del or rm to delete files?
Ant: primarily for Java, but equivalent tools now exist for .NET
SCons
CruiseControl and Bitten
Exercise 6.1:
Make gets definitions from environment variables,
command-line parameters, and explicit definitions in Makefiles.
What order does it check these in?
![[Human Time vs. Machine Time]](./img/py01/human_vs_machine_time.png)
Figure 7.1: Human Time vs. Machine Time
Python
Numeric package isn't bad
Python Cookbook for the on-line version
![[Sturdy vs. Nimble Execution]](./img/py01/sturdy_vs_nimble.png)
Figure 7.2: Sturdy vs. Nimble Execution
$ python
Python 2.4.2 (#67, Sep 28 2005, 12:41:11) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print 124/28
4
>>> print 124.0/28.0
4.4285714285714288
>>> ^D
"^D" represents control-D, which is Unix's way of saying “end of input”
.py extension, and type python filename.py
$ cat saved.py
print 124/28
print 124.0/28.0
$ python saved.py
4
4.42857142857
#!/usr/bin/python the first line of the program
/usr/bin/python with the rest of the file as its input
which python to find out
#!/usr/bin/env python as the first line
/usr/bin/env to find Python, then run the script with it
$ cat hashbang.py
#!/usr/bin/env python
print 124/28
print 124.0/28.0
$ hashbang.py
4
4.42857142857
.py files with Python
.py will then run it
planet = "Pluto"
moon = "Charon"
p = planet
![[Variables Refer to Values]](./img/py01/vars_values.png)
Figure 7.3: Variables Refer to Values
planet = "Pluto"
moon = "Charon"
p = planet
planet = 9
![[Variables Are Untyped]](./img/py01/vars_untyped.png)
Figure 7.4: Variables Are Untyped
planet = "Sedna"
print plant # note the misspelling
Traceback (most recent call last):
  File "lec/inc/py01/undefined_var.py", line 2, in ?
    print plant # note the misspelling
NameError: name 'plant' is not defined
"#" to the end of the line is a comment
x = "two"    # "two" is a string
y = 2        # 2 is an integer
print x * y  # multiplying a string concatenates it repeatedly
print x + y  # but you can't add an integer and a string
twotwo
Traceback (most recent call last):
  File "lec/inc/py01/add_int_str.py", line 4, in ?
    print x + y # but you can't add an integer and a string
TypeError: cannot concatenate 'str' and 'int' objects
print statement prints zero or more values to standard output
print on its own just prints a blank line
planet = "Pluto"
num_moons = 1
moon = "Charon"
print planet, "has", num_moons, "satellite",
print "and its name is", moon
Pluto has 1 satellite and its name is Charon
print "He said, \"It ain't what you know, it's what you can.\""
He said, "It ain't what you know, it's what you can."
print "Sedna was discovered in 2004" print 'It takes 10,500 years to circle the sun.' print '''The tiny world may be part of the Oort Cloud, a shell of icy proto-comets left over from the formation of the Solar System.'''
str converts things to strings
print "Diameter: " + str(1280) + "-" + str(1760) + " km"
Diameter: 1280-1760 km
int, float, etc. to convert values to other types
print int(12.3)
print float(4)
12
4.0
"\t" and newline "\n"
| Expression | Meaning |
|---|---|
| \\ | backslash |
| \' | single quote |
| \" | double quote |
| \b | backspace |
| \n | newline |
| \r | carriage return |
| \t | tab |
| Table 7.1: Character Escape Sequences | |
14 is an integer (32 bits long on most machines)
14.0 is double-precision floating point (64 bits long)
1+4j is a complex number (two 64-bit values)
x.real and x.imag to get the real and imaginary parts
123456789L is a long integer
** for exponentiation
| Name | Operator | Use | Value | Notes |
|---|---|---|---|---|
| Addition | + | 35 + 22 | 57 | |
| | | 'Py' + 'thon' | 'Python' | |
| Subtraction | - | 35 - 22 | 13 | |
| Multiplication | * | 3 * 2 | 6 | |
| | | 'Py' * 2 | 'PyPy' | 2 * 'Py' gives the same result |
| Division | / | 3.0 / 2 | 1.5 | |
| | | 3 / 2 | 1 | Integer division rounds down: -3 / 2 is -2, not -1 |
| Exponentiation | ** | 2 ** 0.5 | 1.4142135623730951 | |
| Remainder | % | 13 % 5 | 3 | |
| Table 7.2: Numeric Operators in Python | ||||
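The rows of Table 7.2 can be checked directly. This is a small sketch, not part of the original notes. It uses the `//` floor-division operator (available since Python 2.2) so that the integer cases behave the same under Python 2 and Python 3; plain `/` on two integers only rounds down in Python 2.

```python
# Floor division rounds toward negative infinity, not toward zero.
assert 3 // 2 == 1
assert -3 // 2 == -2
assert 3.0 / 2 == 1.5          # true division once either operand is a float

# The remainder takes the sign of the divisor, so this identity always holds.
assert -3 % 2 == 1
assert (-3 // 2) * 2 + (-3 % 2) == -3

# String repetition accepts the integer on either side.
assert 'Py' * 2 == 'PyPy'
assert 2 * 'Py' == 'PyPy'

# A fractional exponent produces a float.
root_two = 2 ** 0.5
print(root_two)
```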
x += 3 does the same thing as x = x + 3
5 += 3 is an error, since you can't assign a new value to 5…
True and False are true and false (d'oh)
and, or, not
| Expression | Result | Notes |
|---|---|---|
| True or False | True | |
| True and False | False | |
| 'a' or 'b' | 'a' | or is true if either side is true, so it stops after evaluating 'a' |
| 'a' and 'b' | 'b' | and is only true if both sides are true, so it doesn't stop until it has evaluated 'b' |
| 0 or 'b' | 'b' | 0 is false, but 'b' is true |
| 0 and 'b' | 0 | Since 0 is false, Python can stop evaluating there |
| 0 and (1/0) | 0 | 1/0 would be an error, but Python never gets that far |
| (x and 'set') or 'not set' | It depends | If x is true, this expression's value is 'set'; if x is false, it is 'not set' |
| Table 7.3: Boolean Operators in Python | ||
and and or are short-circuit operators
True or False)
![[Short-Circuit Evaluation]](./img/py01/short_circuit_eval.png)
Figure 7.5: Short-Circuit Evaluation
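One way to watch short-circuit evaluation happen is to record which operands actually get evaluated. The sketch below is illustrative only: `noisy` and `calls` are hypothetical names, not anything from the notes. It also shows why the `(x and 'set') or 'not set'` row of Table 7.3 depends on the middle value being true.

```python
calls = []  # labels of the operands that were actually evaluated

def noisy(label, value):
    """Hypothetical helper: note that we were called, then hand back value."""
    calls.append(label)
    return value

# 'or' stops at the first true operand and returns that operand.
first = noisy('a', 'a') or noisy('b', 'b')

# 'and' stops at the first false operand and returns that operand.
second = noisy('zero', 0) and noisy('never', 1)

# The and/or idiom falls through to the right-hand value when the middle is false.
picked = (True and 'set') or 'not set'
fell_through = (True and '') or 'not set'

print(first)
print(second)
print(calls)
```

Only `'a'` and `'zero'` appear in `calls`: the right-hand operands were never evaluated.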
val = cond and left or right
cond is True, val is assigned left (provided left is itself true; otherwise val is assigned right)
cond is False, val is assigned right
| Expression | Value |
|---|---|
| 3 < 5 | True |
| 3.0 < 5 | True |
| 3 != 5 | True |
| 3 == 5 | False |
| 3 < 5 <= 7 | True |
| 3 < 5 >= 2 | True (but please don't write this: it's hard to read) |
| 3+2j < 5 | Error: can only use == and != on complex numbers |
| Table 7.4: Comparison Operators in Python | |
= for assignment
== to test if two things have equal values
| Expression | Value |
|---|---|
| 'abc' < 'def' | True |
| 'abc' < 'Abc' | False |
| 'ab' < 'abc' | True |
| '0' < '9' | True |
| '100' < '2' | True |
| Table 7.5: String Comparisons in Python | |
if, elif (not else if), and else
a = 3
if a < 0:
    print 'less'
elif a == 0:
    print 'equal'
else:
    print 'greater'
greater
begin/end or {…}?
num_moons = 3
while num_moons > 0:
    print num_moons
    num_moons -= 1
3
2
1
print 'before'
num_moons = -1
while num_moons > 0:
    print num_moons
    num_moons -= 1
print 'after'
before
after
num_moons = 3
while num_moons > 0:
    print num_moons
    # oops --- forgot to subtract one
3
3
3
⋮
break
num_moons = 3
while True:                 # Looks like an infinite loop...
    print num_moons
    num_moons -= 1
    if num_moons <= 1:
        break               # ...but there's a way out
3
2
continue
num_moons = 5
while num_moons > 0:
    print 'top:', num_moons
    num_moons -= 1
    if (num_moons % 2) == 0:
        continue
    print '...bottom:', num_moons
top: 5
top: 4
...bottom: 3
top: 3
top: 2
...bottom: 1
top: 1
% operator formats output
'here %s go' % 'we' creates "here we go"
"%s" in the left string means “insert a string here”
'left %d right %d' % (-1, 1) creates "left -1 right 1"
"%d" stands for “decimal integer”
'%04d' % 13 creates "0013"
'[%-4d]' % 13 creates "[13  ]" (left-justified in a field four characters wide)
'%6.4f %%' % 37.2 creates "37.2000 %"
"%%" is translated into a single "%"
"\\" is how you represent a single "\" in a string
| Format | Meaning | Example | Output |
|---|---|---|---|
| "d" | Signed decimal integer | '%d %d' % (13, 15) | "13 15" |
| "o" | Unsigned octal (base-8) | '%o %o' % (13, 15) | "15 17" |
| "x" | Lower case unsigned hexadecimal (base-16) | '%x %x' % (13, 15) | "d f" |
| "X" | Upper case unsigned hexadecimal (base-16) | '%X %X' % (13, 15) | "D F" |
| "e" | Lower case exponential floating point | '%e' % 123.45 | "1.234500e+02" |
| "E" | Upper case exponential floating point | '%E' % 123.45 | "1.234500E+02" |
| "f" | Decimal floating point (six digits after the decimal point by default) | '%f' % 123.45 | "123.450000" |
| "s" | String (converts other types using str()) | '%s %s %s' % ('nickel', 28, 58.69) | "nickel 28 58.69" |
| Table 7.6: String Formats in Python | |||
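The format codes above combine with width and precision. The following sketch checks a few of them with assertions; the final `row` layout is an invented example rather than something from the notes.

```python
assert 'here %s go' % 'we' == 'here we go'
assert 'left %d right %d' % (-1, 1) == 'left -1 right 1'
assert '%04d' % 13 == '0013'            # pad with zeros to width 4
assert '[%-4d]' % 13 == '[13  ]'        # left-justify in a field of width 4
assert '%6.4f %%' % 37.2 == '37.2000 %'
assert '%f' % 123.45 == '123.450000'    # default precision is six digits
assert '%.4f' % 123.45 == '123.4500'    # ask for four digits explicitly
assert '%x %X' % (13, 15) == 'd F'
assert '%e' % 123.45 == '1.234500e+02'

# Widths make it easy to line values up in columns.
row = '%-10s %3d %8.3f' % ('nickel', 28, 58.69)
print(row)
```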
str[1:-1] means
text[0] is the first character of text
len returns the length of a string
text has index len(text)-1
element = "boron"
i = 0
while i < len(element):
    print element[i]
    i += 1
b
o
r
o
n
$ python
>>> element = 'gold'
>>> print 'element is', element
element is gold
>>> element[0] = 's'
TypeError: object does not support item assignment
element = 'gold'
print 'element is', element
element = 'lead'
print 'element is now', element
element is gold
element is now lead
text[start:end] takes a slice out of text
text from start up to (but not including) end
element = "helium"
print element[1:3], element[:2], element[4:]
el he um
![[Visualizing Indices]](./img/py02/indices.png)
Figure 8.1: Visualizing Indices
$ python
>>> element = 'helium'
>>> print element[1:22]
elium
>>> x = element[22]
IndexError: string index out of range
x[-1] is the last character
x[-2] is the second-to-last character
element = "carbon"
print element[-2], element[-4], element[-6]
o r c
x[len(x)-1]
![[Visualizing Negative Indices]](./img/py02/negative_indices.png)
Figure 8.2: Visualizing Negative Indices
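The indexing and slicing rules so far can be summarized as a handful of checks. This is a sketch for experimentation, using the same `element` string as the examples above.

```python
element = 'carbon'

assert element[0] == 'c' and element[len(element) - 1] == 'n'
assert element[-1] == element[len(element) - 1]  # negative indices count from the end
assert element[1:3] == 'ar'                      # up to, but not including, index 3
assert element[:2] + element[2:] == element      # the two halves always rejoin
assert element[1:22] == 'arbon'                  # out-of-range slice bounds are clipped
assert element[2:1] == ''                        # empty when start is past end

# A single out-of-range index is an error rather than being clipped.
try:
    element[22]
    outcome = 'no error'
except IndexError:
    outcome = 'IndexError'
print(outcome)
```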
text[1:2] is either:
text…
text doesn't have a second character)
text[2:1] is always the empty string
text[1:1]
text[1:-1] is everything except the first and last characters
meth of object obj, type obj.meth()
| Method | Purpose | Example | Result |
|---|---|---|---|
| capitalize | Capitalize first letter of string | "text".capitalize() | "Text" |
| lower | Convert all letters to lowercase. | "aBcD".lower() | "abcd" |
| upper | Convert all letters to uppercase. | "aBcD".upper() | "ABCD" |
| strip | Remove leading and trailing whitespace (blanks, tabs, newlines, etc.) | " a b ".strip() | "a b" |
| lstrip | Remove whitespace at left (leading) edge of string. | " a b ".lstrip() | "a b " |
| rstrip | Remove whitespace at right (trailing) edge of string. | " a b ".rstrip() | " a b" |
| count | Count how many times one string appears in another. | "abracadabra".count("ra") | 2 |
| find | Return the index of the first occurrence of one string in another, or -1. | "abracadabra".find("ra") | 2 |
| | | "abracadabra".find("xyz") | -1 |
| replace | Replace occurrences of one string with another. | "abracadabra".replace("ra", "-") | "ab-cadab-" |
| Table 8.1: String Methods | |||
element = 'helium'
print element.upper()
print element.replace('el', 'afn')
print 'element after calls:', element
HELIUM
hafnium
element after calls: helium
element = "cesium"
print ':' + element.upper()[4:7].center(10) + ':'
:    UM    :
in to check whether one string appears in another
find method
print "ant" in "tantalum"
print "mat" in "tantalum"
True
False
[]
gases = ['He', 'Ne', 'Ar', 'Kr']
print gases
print gases[0], gases[-1]
['He', 'Ne', 'Ar', 'Kr']
He Kr
x[i] = v
gases = ['He', 'Ne', 'Ar', 'Kr']
print 'before:', gases
gases[0] = 'H'
gases[-1] = 'Xe'
print 'after:', gases
before: ['He', 'Ne', 'Ar', 'Kr']
after: ['H', 'Ne', 'Ar', 'Xe']
$ python
>>> gases = ['He', 'Ne', 'Ar', 'Kr']
>>> print 'before:', gases
before: ['He', 'Ne', 'Ar', 'Kr']
>>> gases[10] = 'Ra'
IndexError: list assignment index out of range
append to add an element to the end of a list
characters = []
print characters
for c in 'aeiou':
    characters.append(c)
    print characters
[]
['a']
['a', 'e']
['a', 'e', 'i']
['a', 'e', 'i', 'o']
['a', 'e', 'i', 'o', 'u']
element = 'carbon'
mass = '14'
print element + '-' + mass
lanthanides = ['Ce', 'Pr', 'Nd']
actinides = ['Th', 'Pa', 'U']
all = lanthanides + actinides
print all
carbon-14
['Ce', 'Pr', 'Nd', 'Th', 'Pa', 'U']
list(text) creates a list whose elements are the characters of the string text
water = 'H2O'
print 'before conversion:', water
water = list(water)
print 'after conversion:', water
before conversion: H2O
after conversion: ['H', '2', 'O']
del deletes a list element
organics = ['H', 'C', 'O', 'N']
print 'original:', organics
del organics[2]
print 'after deleting item 2:', organics
del organics[-2:]
print 'after deleting the last two remaining items:', organics
original: ['H', 'C', 'O', 'N']
after deleting item 2: ['H', 'C', 'N']
after deleting the last two remaining items: ['H']
organics = ['H', 'C', 'O', 'N']
print 'original:', organics
del organics[1:-1]
print 'after deleting the middle:', organics
original: ['H', 'C', 'O', 'N']
after deleting the middle: ['H', 'N']
del is a statement, not an operator
metals is initially ['gold', 'iron', 'lead', 'gold']
| Method | Purpose | Example | Result |
|---|---|---|---|
| append | Add to the end of the list. | metals.append('tin') | ['gold', 'iron', 'lead', 'gold', 'tin'] |
| count | Count how many times something appears in the list. | metals.count('gold') | 2 |
| index | Find the first occurrence of something in the list. | metals.index('iron') | 1 |
| | | metals.index('sulfur') | ValueError (lists have no find method, and index does not return -1) |
| insert | Insert something into the list. | metals.insert(2, 'silver') | ['gold', 'iron', 'silver', 'lead', 'gold'] |
| remove | Remove the first occurrence of something from the list. | metals.remove('gold') | ['iron', 'lead', 'gold'] |
| reverse | Reverse the list in place. | metals.reverse() | ['gold', 'lead', 'iron', 'gold'] |
| sort | Sort the list in place. | metals.sort() | ['gold', 'gold', 'iron', 'lead'] |
| Table 8.2: List Methods | |||
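A short sketch of the list methods in action, using the same `metals` list as the table. Two behaviors are worth seeing for yourself: looking up a missing item with `index` raises `ValueError`, and the in-place methods return `None`.

```python
metals = ['gold', 'iron', 'lead', 'gold']

assert metals.count('gold') == 2
assert metals.index('iron') == 1

# Looking up a missing item raises ValueError rather than returning -1.
try:
    metals.index('sulfur')
    found = True
except ValueError:
    found = False

# sort (like reverse) changes the list in place and returns None.
result = metals.sort()
print(metals)
print(result)
```

Because `result` is `None`, writing `metals = metals.sort()` would throw the list away.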
index reports an error if the item can't be found
reverse and sort change the list, and return None
False
x = x.reverse() is a common error
x, but then sets x to None, so all data is lost
for loops over the content of a collection (such as a string or list)
for c in some_string assigns c each character of some_string
for v in some_list assigns v each value of some_list
for c in 'lead':
    print '/' + c + '/',
print
for v in ['he', 'ar', 'ne', 'kr']:
    print v.capitalize()
/l/ /e/ /a/ /d/
He
Ar
Ne
Kr
range creates the list [start, start+1, ..., end-1]
end-1 to be consistent with x[start:end]
print 'up to 5:', range(5)
print '2 to 5:', range(2, 5)
print '2 to 10 by 2:', range(2, 10, 2)
print '10 to 2:', range(10, 2)
print '10 to 2 by -2:', range(10, 2, -2)
up to 5: [0, 1, 2, 3, 4]
2 to 5: [2, 3, 4]
2 to 10 by 2: [2, 4, 6, 8]
10 to 2: []
10 to 2 by -2: [10, 8, 6, 4]
range(end) and range(start, end, step)
range may generate an empty list
for i in range(N)
for i in range(len(sequence))
element = 'sulfur'
for i in range(len(element)):
    print i, element[i]
0 s
1 u
2 l
3 f
4 u
5 r
x in c works element-by-element on lists
3 in [1, 2, 3, 4] is True
[2, 3] in [1, 2, 3, 4] is False
![[Line Segment]](./img/py02/line_segment.png)
Figure 8.3: Line Segment
elements = [['H', 'Li', 'Na'], ['F', 'Cl']]
print 'first item in outer list:', elements[0]
print 'second item of second sublist:', elements[1][1]
first item in outer list: ['H', 'Li', 'Na']
second item of second sublist: Cl
elements = [['H', 'Li'], ['F', 'Cl']]
gases = elements[1]
print 'before'
print 'elements:', elements
print 'gases:', gases
gases[1] = 'Br'
print 'after'
print 'elements:', elements
before
elements: [['H', 'Li'], ['F', 'Cl']]
gases: ['F', 'Cl']
after
elements: [['H', 'Li'], ['F', 'Br']]
![[Aliasing In Action]](./img/py02/aliasing.png)
Figure 8.4: Aliasing In Action
metals = ['Cr', 'Mn', 'Fe', 'Co', 'Ni', 'Cu', 'Zn']
middle = metals[2:-2]
print 'before'
print 'metals:', metals
print 'middle:', middle
middle[0] = 'Al'
del middle[1]
print 'after'
print 'metals:', metals
print 'middle:', middle
before
metals: ['Cr', 'Mn', 'Fe', 'Co', 'Ni', 'Cu', 'Zn']
middle: ['Fe', 'Co', 'Ni']
after
metals: ['Cr', 'Mn', 'Fe', 'Co', 'Ni', 'Cu', 'Zn']
middle: ['Al', 'Ni']
![[Slicing Lists]](./img/py02/slice_copy.png)
Figure 8.5: Slicing Lists
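The difference between an alias and a slice copy is easy to test. The sketch below also shows one caveat the figures do not cover: a slice copies only the outer list, so nested lists are still shared. The names `alias`, `duplicate`, and `shallow` are illustrative.

```python
elements = [['H', 'Li'], ['F', 'Cl']]

alias = elements[1]           # a second name for the same inner list
alias[1] = 'Br'
assert elements[1] == ['F', 'Br']      # the change shows through both names

duplicate = elements[1][:]    # slicing builds a new list
duplicate[0] = 'I'
assert elements[1] == ['F', 'Br']      # the original is untouched

# A slice copies one level only: the inner lists are still shared.
shallow = elements[:]
shallow[0][0] = 'D'
print(elements)
```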
(1, 2, 3) instead of [1, 2, 3]
()
(55,)
(55) has to be just the integer 55, or the mathematicians will get upset
1, 2, 3 is the same as (1, 2, 3)
left, right = "gold", "lead" assigns "gold" to left, and "lead" to right
left, middle, right = ["gold", "iron", "lead"] works
left, right = ["gold", "iron", "lead"] doesn't
left, right = right, left does a safe swap
for loops to unpack structures on the fly
elements = [
    ['H', 'hydrogen', 1.008],
    ['He', 'helium', 4.003],
    ['Li', 'lithium', 6.941],
    ['Be', 'beryllium', 9.012]
]
for (symbol, name, weight) in elements:
    print name + ' (' + symbol + '): ' + str(weight)
hydrogen (H): 1.008
helium (He): 4.003
lithium (Li): 6.941
beryllium (Be): 9.012
open to open a file
"r" (for read) or "w" for write
input_file = open('count_bytes.py', 'r')
content = input_file.read()
input_file.close()
print len(content), 'bytes in file'
121 bytes in file
| Method | Purpose | Example |
|---|---|---|
| close | Close the file; no more reading or writing is allowed | input_file.close() |
| read | Read N bytes from the file, returning the empty string at the end of the file. | next_block = input_file.read(1024) |
| | If N is not given, read the rest of the file. | rest = input_file.read() |
| readline | Read the next line of text from the file, returning the empty string at the end of the file. | line = input_file.readline() |
| readlines | Return the remaining lines in the file as a list, or an empty list at the end of the file. | rest = input_file.readlines() |
| write | Write a string to a file. write does not automatically append a newline. | output_file.write("Element 8: Oxygen") |
| writelines | Write each string in a list to a file (without appending newlines). | output_file.writelines(["H", "He", "Li"]) |
| Table 8.3: File Methods | ||
input_file = open('file.txt', 'r')
output_file = open('copy.txt', 'w')
line = input_file.readline()
while line:
    output_file.write(line)
    line = input_file.readline()
input_file.close()
output_file.close()
file.txt for reading, and assigns the file object to input_file
copy.txt for writing, and assigns the file object to output_file
input_file
line is assigned the empty string
output_file
input_file = open('count_lines.py', 'r')
count = 0
for line in input_file:
    count += 1
input_file.close()
print count, 'lines in file'
6 lines in file
input_file = open('file.txt', 'r')
lines = input_file.readlines()
input_file.close()
output_file = open('copy.txt', 'w')
output_file.writelines(lines)
output_file.close()
input_file = open('file.txt', 'r')
output_file = open('copy.txt', 'w')
for line in input_file:
    line = line.rstrip()
    print >> output_file, line
input_file.close()
output_file.close()
print >> file sends print's output to a file
Exercise 8.1:
What does "aaaaa".count("aaa") return? Why?
Exercise 8.2:
What does each of the following five code fragments do? Why?
x = ['a', 'b', 'c', 'd'] x[0:2] = [] |
x = ['a', 'b', 'c', 'd'] x[0:2] = ['q'] |
x = ['a', 'b', 'c', 'd'] x[0:2] = 'q' |
x = ['a', 'b', 'c', 'd'] x[0:2] = 99 |
x = ['a', 'b', 'c', 'd'] x[0:2] = [99] |
Exercise 8.3:
What does 'a'.join(['b', 'c', 'd']) return? If you have
a list of strings, how can you concatenate them in a single statement?
Why do you think join is written this way, rather than as
['b', 'c', 'd'].join('a')?
import does
sys, math, and os libraries
def
def double(x):
    return x * 2
print double(5)
print double(['basalt', 'granite'])
10
['basalt', 'granite', 'basalt', 'granite']
return
def sign(x):
    if x < 0:
        return -1
    if x == 0:
        return 0
    return 1
return statements scattered through them are hard to understand
return statements return None
return on its own is the same as return None
def hello():
    print 'HELLO'
def world():
    print 'WORLD'
    return
print hello()
print world()
HELLO
None
WORLD
None
None, an integer, or a list, the caller will have to write an if statement
# Global variable.
rock_type = 'unknown'
# Function that creates local variable.
def classify(rock_name):
    if rock_name in ['basalt', 'granite']:
        rock_type = 'igneous'
    elif rock_name in ['sandstone', 'shale']:
        rock_type = 'sedimentary'
    else:
        rock_type = 'metamorphic'
    print 'in function, rock_type is', rock_type
# Call the function to prove that it uses its local 'rock_type'.
print "before function, rock_type is", rock_type
classify('sandstone')
print "after function, rock_type is", rock_type
before function, rock_type is unknown
in function, rock_type is sedimentary
after function, rock_type is unknown
![[Call Stack]](./img/py03/call_stack.png)
Figure 9.1: Call Stack
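The scoping rule can be reduced to two cases: assigning to a name inside a function creates a local variable, while merely reading a name that has no local definition finds the global one. The sketch below uses hypothetical helpers `classify` and `describe`, and assumes it is run as a top-level script so that `rock_type` really is a module-level variable.

```python
rock_type = 'unknown'            # global variable

def classify(rock_name):
    # Assignment creates a local rock_type that hides the global one.
    rock_type = 'igneous'
    return rock_type

def describe():
    # No assignment here, so the lookup falls through to the global.
    return 'rock_type is ' + rock_type

returned = classify('basalt')
described = describe()
print(returned)
print(described)
```

Returning the result, as `classify` does here, is usually clearer than having a function communicate through shared variables.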
def add_salt(first, second):
    first += "salt"
    second += ["salt"]
str = "rock"
seq = ["gneiss", "shale"]
print "before"
print "str is:", str
print "seq is:", seq
add_salt(str, seq)
print "after"
print "str is:", str
print "seq is:", seq
before
str is: rock
seq is: ['gneiss', 'shale']
after
str is: rock
seq is: ['gneiss', 'shale', 'salt']
![[Parameter Passing]](./img/py03/parameter_passing.png)
Figure 9.2: Parameter Passing
values[:] is the same as values[0:len(values)]…
values that includes the entire list…
def add_salt(first, second):
    first += "salt"
    second += ["salt"]
str = "rock"
seq = ["gneiss", "shale"]
print "before"
print "str is:", str
print "seq is:", seq
add_salt(str, seq[:])
print "after"
print "str is:", str
print "seq is:", seq
before
str is: rock
seq is: ['gneiss', 'shale']
after
str is: rock
seq is: ['gneiss', 'shale']
![[Passing Slices]](./img/py03/passing_slices.png)
Figure 9.3: Passing Slices
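Passing a slice puts the burden of copying on every caller. An alternative, sketched here with a hypothetical helper called `salted`, is for the function itself to copy its argument and return the new list. This is one possible design, not the approach the notes prescribe.

```python
def salted(values):
    """Hypothetical helper: return a new list instead of changing the argument."""
    result = values[:]        # copy first, so the caller's list is untouched
    result.append("salt")
    return result

rocks = ["gneiss", "shale"]
with_salt = salted(rocks)

print(rocks)
print(with_salt)
```

The caller's list is unchanged, and the effect of the call is visible in the assignment rather than hidden inside the function.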
def total(values, start=0, end=None):
    # If no values given, total is zero.
    if not values:
        return 0
    # If no end specified, use the entire sequence.
    if end is None:
        end = len(values)
    # Calculate.
    result = 0
    for i in range(start, end):
        result += values[i]
    return result
numbers = [10, 20, 30]
print "numbers being added:", numbers
print "total(numbers, 0, 3):", total(numbers, 0, 3)
print "total(numbers, 2):", total(numbers, 2)
print "total(numbers):", total(numbers)
numbers being added: [10, 20, 30]
total(numbers, 0, 3): 60
total(numbers, 2): 30
total(numbers): 60
def is just a shorthand for “create a function, and assign it to a variable”
def circumference(r):
    return 2 * 3.14159 * r
circ = circumference
print 'circumference(1.0):', circumference(1.0)
print 'circ(2.0):', circ(2.0)
circumference(1.0): 6.28318
circ(2.0): 12.56636
![[Functions As Objects]](./img/py03/function_objects.png)
Figure 9.4: Functions As Objects
def apply_to_list(function, values):
    result = []
    for v in values:
        temp = function(v)
        result.append(temp)
    return result
radii = [0.1, 1.0, 10.0]
print apply_to_list(circumference, radii)
[0.62831800000000004, 6.2831799999999998, 62.831800000000001]
def area(r):
    return 3.14159 * r * r
def color(r):
    return "unknown"
def apply_each(functions, value):
    result = []
    for f in functions:
        temp = f(value)
        result.append(temp)
    return result
functions = [circumference, area, color]
print apply_each(functions, 1.0)
[6.2831799999999998, 3.1415899999999999, 'unknown']
__name__
def sedimentary(rock_name):
    return rock_name in ['sandstone', 'shale']
sed = sedimentary
print 'original name:', sedimentary.__name__
print 'name of alias:', sed.__name__
original name: sedimentary
name of alias: sedimentary
geology.thing
geology.py
def rock_type(rock_name):
    if rock_name in ['basalt', 'granite']:
        return 'igneous'
    elif rock_name in ['sandstone', 'shale']:
        return 'sedimentary'
    else:
        return 'metamorphic'
analysis.py
import geology
for r in ['granite', 'gneiss']:
    print r, 'is', geology.rock_type(r)
analysis.py runs, it prints this
granite is igneous
gneiss is metamorphic
outer.py
manager = "Albus Dumbledore"
import inner
print "outer:", manager
print "inner:", inner.get_manager()
inner.py
manager = "Lucius Malfoy"
def get_manager():
    return manager
outer.py produces this:
outer: Albus Dumbledore
inner: Lucius Malfoy
import geology as g, then call g.print_version()
from geology import print_version, then call print_version()
from geology import * imports everything from geology
import is a statement
def are statements
geology.py
print 'loading geology module'
def rock_type(rock_name):
    if rock_name in ['basalt', 'granite']:
        return 'igneous'
    elif rock_name in ['sandstone', 'shale']:
        return 'sedimentary'
    else:
        return 'metamorphic'
print 'geology module loaded'
analysis.py:
loading geology module
geology module loaded
granite is igneous
gneiss is metamorphic
__name__ is set to:
"__main__", if it is the main program
def is_rock(name):
    return name in ['basalt', 'granite', 'sandstone', 'shale']
if __name__ == '__main__':
    tests = [['basalt', True], ['gingerale', False],
             [12345678, False], ['sandstone', True]]
    for (value, expected) in tests:
        actual = is_rock(value)
        if actual == expected:
            print 'pass'
        else:
            print 'fail'
$ python self_test.py
pass
pass
pass
pass
$ python
>>> import self_test
>>> self_test.is_rock('sugar')
False
sys
| Type | Name | Purpose | Example | Result |
|---|---|---|---|---|
| Data | argv | The program's command line arguments | sys.argv[0] | "myscript.py" (or whatever your program is called) |
| | maxint | Largest positive value that can be represented by Python's basic integer type | sys.maxint | 2147483647 |
| | path | List of directories that Python searches when importing modules | sys.path | ['/home/greg/pylib', '/Python24/lib', '/Python24/lib/site-packages'] |
| | platform | What type of operating system Python is running on | sys.platform | "win32" |
| | stdin | Standard input | sys.stdin.readline() | (Typically) the next line of input from the keyboard |
| | stdout | Standard output | sys.stdout.write('****') | (Typically) print four stars to the screen |
| | stderr | Standard error | sys.stderr.write('Program crashing!\n') | Print an error message to the screen |
| | version | What version of Python this is | sys.version | "2.4 (#60, Feb 9 2005, 19:03:27) [MSC v.1310 32 bit (Intel)]" |
| Function | exit | Exit from Python, returning a status code to the operating system | sys.exit(0) | Terminates program with status 0 |
| Table 9.1: The Python Runtime System Library | ||||
sys.argv contains the program's command-line arguments
sys.argv[0]
import sys
for i in range(len(sys.argv)):
    print i, sys.argv[i]
$ python command_line.py
0 command_line.py
$ python command_line.py first second
0 command_line.py
1 first
2 second
sys.stdin and sys.stdout are standard input and output
sys.stderr is connected to standard error
import sys
count = 0
for line in sys.stdin.readlines():
    count += 1
sys.stdout.write('read ' + str(count) + ' lines')
$ python standard_io.py < standard_io.py
read 7 lines
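Because `sys.stdin` is just a file object, the counting logic can be written as a function that accepts any stream, which makes it testable without a keyboard. In this sketch `count_lines` is a hypothetical helper, and `io.StringIO` stands in for standard input; a real script would call `count_lines(sys.stdin)`.

```python
import io
import sys

def count_lines(stream):
    """Count lines by iterating, so only one line is held in memory at a time."""
    count = 0
    for line in stream:
        count += 1
    return count

# An in-memory file-like object plays the role of sys.stdin here.
fake_input = io.StringIO(u"first\nsecond\nthird\n")
total = count_lines(fake_input)
sys.stdout.write('read ' + str(total) + ' lines\n')
```

Iterating over the stream avoids building the whole list of lines that `readlines()` would create.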
sys.path is the list of places Python is allowed to look to find modules for import
PYTHONPATH environment variable
sys.path is ['/home/swc/lib', '/Python24/lib'], then import geology will try:
./geology.py
/home/swc/lib/geology.py
/Python24/lib/geology.py
sys.exit terminates the program
sys.exit(1) or something similar so that the operating system knows something's gone wrong
math library
| Type | Name | Purpose | Example | Result |
|---|---|---|---|---|
| Constant | e | Constant | e | 2.71828… |
| | pi | Constant | pi | 3.14159… |
| Function | ceil | Ceiling | ceil(2.5) | 3.0 |
| | floor | Floor | floor(-2.5) | -3.0 |
| | exp | Exponential | exp(1.0) | 2.71828… |
| | log | Logarithm | log(4.0) | 1.38629… |
| | | | log(4.0, 2.0) | 2.0 |
| | log10 | Base-10 logarithm | log10(4.0) | 0.60205… |
| | pow | Power | pow(2.5, 2.0) | 6.25 |
| | sqrt | Square root | sqrt(9.0) | 3.0 |
| | cos | Cosine | cos(pi) | -1.0 |
| | asin | Arc sine | asin(-1.0) | -1.5707… |
| | hypot | Euclidean norm sqrt(x**2 + y**2) | hypot(2, 3) | 3.60555… |
| | degrees | Convert from radians to degrees | degrees(pi) | 180.0 |
| | radians | Convert from degrees to radians | radians(45) | 0.78539… |
| Table 9.2: The Python Math Library | ||||
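A few of the entries in Table 9.2 checked in code. Results that are not exactly representable in floating point are compared with a small tolerance rather than with `==`; the tolerance of `1e-12` is an arbitrary choice for this sketch.

```python
import math

assert math.ceil(2.5) == 3 and math.floor(-2.5) == -3
assert math.sqrt(9.0) == 3.0
assert math.pow(2.5, 2.0) == 6.25
assert abs(math.log(4.0, 2.0) - 2.0) < 1e-12      # logarithm to base 2
assert abs(math.cos(math.pi) + 1.0) < 1e-12
assert abs(math.degrees(math.pi) - 180.0) < 1e-12

# hypot(x, y) is sqrt(x*x + y*y): the length of the hypotenuse.
side = math.hypot(3.0, 4.0)
print(side)
```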
os module is an interface between Python and the operating system
| Type | Name | Purpose | Example | Result |
|---|---|---|---|---|
| Constant | curdir | The symbolic name for the current directory. | os.curdir | . on Linux or Windows. |
pardir | The symbolic name for the parent directory. | os.pardir | .. on Linux or Windows. | |
sep | The separator character used in paths. | os.sep | / on Linux, \ on Windows. | |
linesep | The end-of-line marker used in text files. | os.linesep | \n on Linux, \r\n on Windows. | |
| Function | listdir | List the contents of a directory. | os.listdir('/tmp') | The names of all the files and directories in /tmp (except . and ..). |
mkdir | Create a new directory. | os.mkdir('/tmp/scratch') | Make the directory /tmp/scratch. Use os.makedirs to make several directories at once. | |
remove | Delete a file. | os.remove('/tmp/workingfile.txt') | Delete the file /tmp/workingfile.txt. | |
rename | Rename (or move) a file or directory. | os.rename('/tmp/scratch.txt', '/home/swc/data/important.txt') | Move the file /tmp/scratch.txt to /home/swc/data/important.txt. | |
rmdir | Remove a directory. | os.rmdir('/home/swc') | Probably not something you want to do… Use os.removedirs to remove several directories at once. | |
stat | Get information about a file or directory. | os.stat('/home/swc/data/important.txt') | Find out when important.txt was created, how large it is, etc. | |
| Table 9.3: The Python Operating System Library | ||||
import sys, os
print 'initial working directory:', os.getcwd()
os.chdir(sys.argv[1])
print 'moved to:', os.getcwd()
print 'contents:', os.listdir(os.curdir)
$ python os_example.py ~/swc
initial working directory: /home/dmalfoy/swc/lec/inc/py03
moved to: /home/dmalfoy/swc
contents: ['.svn', 'conf', 'config.mk', 'data', 'depend.mk', 'thesis']
os.stat returns an object whose members have information about a file or directory, including:
st_size: size in bytes
st_atime: time of most recent access
st_mtime: time of most recent modification
import sys
import os
for filename in sys.argv[1:]:
    status = os.stat(filename)
    print filename, status.st_size, status.st_atime
$ python stat_file.py . stat_file.py
. 0 1137971715
stat_file.py 141 1137971715
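To see `os.stat` report a known size, one can create a scratch file, write a fixed number of bytes to it, and inspect the result. This sketch uses `tempfile.mkstemp` to pick a safe file name; the file contents are an arbitrary example.

```python
import os
import tempfile

# Create a scratch file with known contents (tempfile chooses the name).
handle, path = tempfile.mkstemp()
os.write(handle, b"Element 8: Oxygen\n")   # 18 bytes
os.close(handle)

status = os.stat(path)
size = status.st_size
same_as_getsize = (os.path.getsize(path) == size)
print(size)

os.remove(path)                            # tidy up the scratch file
```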
os has a submodule called os.path
| Type | Name | Purpose | Example | Result |
|---|---|---|---|---|
| Function | abspath | Create normalized absolute pathnames. | os.path.abspath('../jeevan/bin/script.py') | /home/jeevan/bin/script.py (if executed in /home/gvwilson) |
basename | Return the last portion of a path (i.e., the filename, or the last directory name). | os.path.basename('/tmp/scratch/junk.data') | junk.data | |
dirname | Return all but the last portion of a path. | os.path.dirname('/tmp/scratch/junk.data') | /tmp/scratch | |
exists | Return True if a pathname refers to an existing file or directory. | os.path.exists('./scribble.txt') | True if there is a file called scribble.txt in the current working directory, False otherwise. | |
getatime | Get the last access time of a file or directory (like os.stat). | os.path.getatime('.') | 1112109573 (which means that the current directory was last read or written at 10:19:33 EST on March 29, 2005). | |
getmtime | Get the last modification time of a file or directory (like os.stat). | os.path.getmtime('.') | 1112109502 (which means that the current directory was last modified 71 seconds before the time shown above). | |
getsize | Get the size of something in bytes (like os.stat). | os.path.getsize('py03.swc') | 29662. | |
isabs | True if its argument is an absolute pathname. | os.path.isabs('tmp/data.txt') | False | |
isfile | True if its argument identifies an existing file. | os.path.isfile('tmp/data.txt') | True if a file called ./tmp/data.txt exists, and False otherwise. | |
isdir | True if its argument identifies an existing directory. | os.path.isdir('tmp') | True if the current directory has a subdirectory called tmp. | |
join | Join pathname fragments to create a full pathname. | os.path.join('/tmp', 'scratch', 'data.txt') | "/tmp/scratch/data.txt" | |
normpath | Normalize a pathname (i.e., remove redundant slashes, uses of . and .., etc.). | os.path.normpath('tmp/scratch/../other/file.txt') | "tmp/other/file.txt" | |
split | Return both of the values returned by os.path.dirname and os.path.basename. | os.path.split('/tmp/scratch.dat') | ('/tmp', 'scratch.dat') | |
splitext | Split a path into two pieces root and ext, such that ext is the last piece beginning with a ".". | os.path.splitext('/tmp/scratch.dat') | ('/tmp/scratch', '.dat') | |
| Table 9.4: The Python Pathname Library | ||||
import os
print 'does /home/swc exist?', os.path.exists('/home/swc')
print 'is it a directory?', os.path.isdir('/home/swc')
print 'what is its configuration directory?', os.path.join('/home/swc', 'conf')
print 'where is the configuration file?', os.path.split('/home/swc/conf/current.conf')
does /home/swc exist? True
is it a directory? True
what is its configuration directory? /home/swc\conf
where is the configuration file? ('/home/swc/conf', 'current.conf')
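The functions in Table 9.4 are usually combined rather than used one at a time. The sketch below is an illustration only: `backup_name` is a hypothetical helper, and it uses `posixpath` (the POSIX flavour of `os.path`) so that the result does not depend on the platform. Plain `os.path.join` uses the separator of the machine it runs on, which is presumably why the output above shows a backslash. It is written for a current Python 3 interpreter, unlike the Python 2 listings in these notes.

```python
import posixpath  # POSIX flavour of os.path: same results on every platform

def backup_name(path):
    '''Return the path of a ".bak" sibling of path (hypothetical helper).'''
    # Tidy the path first, then take it apart and put it back together.
    directory, filename = posixpath.split(posixpath.normpath(path))
    root, ext = posixpath.splitext(filename)
    return posixpath.join(directory, root + '.bak')

print(backup_name('/tmp/scratch/../other/file.txt'))
```

Here `normpath` removes the `scratch/..` detour before `split` and `splitext` take the name apart.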
Exercise 9.1:
Write a function that takes two strings called text and
fragment as arguments, and returns the number of times
fragment appears in the second half of text. Your
function must not create a copy of the second half of
text. (Hint: read the documentation for string.count.)
Exercise 9.2:
What does the Python keyword global do?
What are some reasons not to write code that uses it?
Exercise 9.3:
Python allows you to import all the functions and variables in a
module at once, making them local names. For example, if the
module is called values, and contains a variable called
Threshold and a function called limit, then after
the statement from values import *, you can then refer
directly to Threshold and limit, rather than having
to use values.Threshold or values.limit. Explain
why this is generally considered a bad thing to do, even though it
reduces the amount programmers have to type.
Exercise 9.4:
sys.stdin, sys.stdout, and sys.stderr are
variables, which means that you can assign to them. For example,
if you want to change where print sends its output, you can
do this:
import sys
print 'this goes to stdout'
temp = sys.stdout
sys.stdout = open('temporary.txt', 'w')
print 'this goes to temporary.txt'
sys.stdout = temp
Do you think this is a good programming practice? When and why do you think its use might be justified?
Exercise 9.5:
os.stat(path) returns an object whose members describe
various properties of the file or directory identified by
path. Using this, write a function that will determine
whether or not a file is more than one year old.
Exercise 9.6:
Write a Python program that takes as its arguments two years (such as 1997 and 2007), and prints out the number of days between the 15th of each month from January of the first year until December of the last year.
Exercise 9.7:
Write a simple version of which in Python.
Your program should check each directory on the caller's path
(in order) to find an executable program that has the name given
to it on the command line.
Exercise 9.8:
In the default parameter value example, why does total
use a default value of None for end, rather than
an integer such as 0 or -1?
Exercise 9.9:
What does the * in front of the parameter extras
mean in the following code example?
def total(*extras):
    result = 0
    for e in extras:
        result += e
    return result
Hint: look at the following three examples:
print total()
print total(19)
print total(2, 3, 5)
Exercise 9.10:
Use the os.path, stat, and time modules
to write a program that finds all files in a directory whose
names end with a specific suffix, and which are more than a
certain number of days old. For example, if your program is run
as oldfiles /tmp .backup 10, it will print a list of
all files in the /tmp directory whose names end in
.backup that are more than 10 days old.
Exercise 9.11:
The Strings, Lists, and Files lecture ended by
showing several different ways to copy files using Python. Read
the documentation for the shutil module, and see if
there's a simpler way.
Exercise 9.12:
Consider the short program shown below:
def add_and_max(new_value, collection=[]):
collection.append(new_value)
return max(collection)
print 'first call:', add_and_max(22)
print 'second call:', add_and_max(9)
print 'third call:', add_and_max(15)
What do you expect its output to be? What is its actual output? Why?
![[Chunking in Short-Term Memory]](./img/style/castling_chunked.png)
Figure 10.1: Chunking in Short-Term Memory
![[Actual Chess Position]](./img/style/chess_actual.png)
Figure 10.2: Actual Chess Position
![[Retention of Actual Chess Position]](./img/style/chess_actual_graph.png)
Figure 10.3: Retention of Actual Chess Position
![[Random Chess Position]](./img/style/chess_random.png)
Figure 10.4: Random Chess Position
![[Retention of Random Chess Position]](./img/style/chess_random_graph.png)
Figure 10.5: Retention of Random Chess Position
PEP-008: Python Style Guide
| Rule | Good | Bad |
|---|---|---|
| No whitespace immediately inside parentheses | max(candidates[sublist]) | max( candidates[ sublist ] ) |
| …or before the parenthesis starting indexing or slicing | max(candidates[sublist]) | max (candidates [sublist] ) |
| No whitespace immediately before comma or colon | if limit > 0: print minimum, limit | if limit > 0 : print minimum , limit |
| Use space around arithmetic and in-place operators | x += 3 * 5 | x+=3*5 |
| No spaces when specifying default parameter values | def integrate(func, start=0.0, interval=1.0) | def integrate(func, start = 0.0, interval = 1.0) |
| Never use names that are distinguished only by "l", "1", "0", or "O" | tempo_long and tempo_init | tempo_l and tempo_1 |
| Short lower-case names for modules (i.e., files) | geology | Geology or geology_package |
| Upper case with underscores for constants | TOLERANCE or MAX_AREA | Tolerance or MaxArea |
| Camel case for class names | SingleVariableIntegrator | single_variable_integrator |
| Lowercase with underscores for function and method names | divide_region | divRegion |
| …and member variables | max_so_far | maxSoFar |
| Use is and is not when comparing to special values | if current is not None: | if current != None: |
| Use isinstance when checking types | if isinstance(current, Rock): | if type(current) == Rock: |
| Table 10.1: Basic Python Style Rules | ||
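The last two rules in Table 10.1 are about correctness as much as style. A minimal sketch of why, assuming a hypothetical `Granite` subclass of the table's `Rock` class (Python 3 syntax):

```python
class Rock(object):
    pass

class Granite(Rock):          # hypothetical subclass, for illustration only
    pass

sample = Granite()

# Comparing types exactly rejects subclasses...
exact_match = type(sample) == Rock
# ...while isinstance accepts anything that "is a" Rock.
is_a_rock = isinstance(sample, Rock)

# "is" tests identity, which is what is wanted for special values like None.
current = None
is_missing = current is None

print(exact_match, is_a_rock, is_missing)
```

Code that checks `type(x) == Rock` stops working as soon as someone derives a new class from `Rock`, which is the usual argument for `isinstance`.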
temperature shouldn't be used to store the number of pottery shards found at a dig site
current_surface_temperature_of_probe is meaningful, but not readable
cstp is easier to read, but hard to understand
…ctsp
curr_ave_temp instead of current_average_temperature is OK
…curnt_av_tmp
i and j for indices in tightly-nested for loops
ExperimentalRecord, rather than ER or ExpRec
import sys, os
import reader, splitter, transpose
a=[]
b=[]
c=[]
d=sys.argv[1]
a=reader.rdlines(d)
b=splitter.splitsec(a)
c=d.split('.')
for i in range(len(b)):
if os.path.isfile('%s.%d.dat'%(c[0],i+1)):
print '%s.%d.dat already exists!'%(c[0],i+1)
break
else:
output=file('%s.%d.dat'%(c[0],i+1),'w')
print>>output,transpose.txpose(b[i])
output.close()
import sys, os
import reader, splitter, transpose
input_file_name = sys.argv[1]
lines = reader.read_lines_from_file(input_file_name)
sections = splitter.split_into_sections(lines)
file_name_stem = input_file_name.split('.')[0]
for i in range(len(sections)):
output_file_name = '%s.%d.dat' % (file_name_stem, i+1)
if os.path.isfile(output_file_name):
print '%s already exists!' % output_file_name
break
else:
output = file(output_file_name, 'w')
print >> output, transpose.transpose(sections[i])
output.close()
# What's missing, and what's extra?
def diff_filelist(dir_path, manifest,
ignore=[os.curdir, os.pardir, '.svn']):
def show_diff(title, diff):
if diff:
print title
for d in diff:
print '\t' + d
expected = Set()
inf = open(manifest, 'r')
for line in inf:
expected.add(line.strip())
inf.close()
actual = Set()
contents = os.listdir(dir_path)
for c in contents:
if c not in ignore:
actual.add(c)
show_diff('missing:', expected - actual)
show_diff('surplus:', actual - expected)
dir_path suggests “directory path”
diff if it isn't empty
manifest file
dir_path
if __name__ == '__main__':
if len(sys.argv) != 3:
print >> sys.stderr, "usage: diff_filelist directory_path manifest_file"
sys.exit(1)
diff_filelist(sys.argv[1], sys.argv[2])
grep or “Find in Files” to search for others
for line in input:
break to handle end-of-input
count = 0
while 1:
line = infile.readline()
if not line:
break
count += 1
1 instead of True because older Pythons didn't define True
PyLint parses programs to create an abstract syntax tree
![[Abstract Syntax Tree]](./img/style/annotated_syntax_tree.png)
Figure 10.6: Abstract Syntax Tree
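To get a feel for what an abstract syntax tree contains, the standard library's `ast` module can parse a small program and walk the result. This is only a sketch of the idea, written for Python 3: it says nothing about how PyLint itself is implemented, and `count_nodes` is a hypothetical helper.

```python
import ast

source = "x = 3 * 5\nprint(x)\n"
tree = ast.parse(source)      # build the abstract syntax tree

def count_nodes(tree):
    '''Count how many nodes of each type appear in the tree.'''
    counts = {}
    for node in ast.walk(tree):
        name = type(node).__name__
        counts[name] = counts.get(name, 0) + 1
    return counts

counts = count_nodes(tree)
for name in sorted(counts):
    print(name, counts[name])
```

A style checker works on a tree like this one rather than on the raw text, so it can ask questions such as "is this name ever assigned?" without running the program.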
PyChecker imports the module (or modules)
PyChecker can't analyze it
$Revision: 421$ when you submit changes
__version__
__version__ = "$Revision: 423$"
if __name__ == '__main__':
print __version__
biomes.dat has a header $Revision: 421$
ecoanalyzer.py
# $Revision: $
# From: biomes.dat version 421
# By: ecoanalyzer.py version 37
# Parameters: sliding_average 20 trim False
# On: 2006-02-22 12:14:07 EST
$Revision:$ header will be expanded when the file is first checked in
/**
* Returns the least common ancestor of two species based on DNA
* comparison, with certainty no less than the specified threshold.
* Note that getConcestor(X, X, t) returns X for any threshold.
*
* @param left one of the base species for the search
* @param right the other base species for the search
* @param threshold the degree of certainty required
* @return the common ancestor, or null if none is found
* @see Species
*/
public Species getConcestor(Species left, Species right, float threshold) {
...implementation...
}
getConcestor
public Species getConcestor(Species left, Species right, float threshold)
Returns the least common ancestor of two species based on DNA comparison, with certainty no less than the specified threshold. Note that getConcestor(X, X, t) returns X for any threshold.
Parameters:
left- one of the base species for the search
right- the other base species for the search
threshold- the degree of certainty required
Returns:
the common ancestor, or null if none is found
See Also:
Species
__doc__ attribute
'''This module provides functions that search and compare genomes.
All functions assume that their input arguments are in valid CCSN-2
format; unless specifically noted, they do not modify their arguments,
print, or have other side effects.
'''
__version__ = '$Revision: 497$'
def get_concestor(left, right, threshold):
'''Find the least common ancestor of two species.
This function searches for a least common ancestor based on DNA
comparison with certainty no less than the specified threshold.
If one can be found, it is returned; otherwise, the function
returns None. get_concestor(X, X, t) returns X for any threshold.
left : one of the base species for the search
right : the other base species for the search
threshold : the degree of certainty required
'''
pass # implementation would go here
$ python
>>> import genome
>>> print genome.__doc__
This module provides functions that search and compare genomes.
All functions assume that their input arguments are in valid CCSN-2
format; unless specifically noted, they do not modify their arguments,
print, or have other side effects.
>>> print genome.get_concestor.__doc__
Find the least common ancestor of two species.
This function searches for a least common ancestor based on DNA
comparison with certainty no less than the specified threshold.
If one can be found, it is returned; otherwise, the function
returns None. get_concestor(X, X, t) returns X for any threshold.
left : one of the base species for the search
right : the other base species for the search
threshold : the degree of certainty required
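Docstrings can also be read from inside a program, not just at the interactive prompt. A minimal sketch in Python 3, using a stand-in `get_concestor` whose body is a placeholder (the real search is not shown in these notes):

```python
import inspect

def get_concestor(left, right, threshold):
    '''Find the least common ancestor of two species.

    Placeholder body: the real search is not part of this sketch.
    '''
    return None

# inspect.getdoc returns the docstring with its indentation cleaned up;
# by convention the first line is a one-sentence summary.
summary = inspect.getdoc(get_concestor).splitlines()[0]
print(summary)
```

Keeping the first line short and self-contained pays off because tools that list functions typically show only that line.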
Docutils will extract, format, and cross-reference docstrings
True if the first is greater than the second
string.startswith
True if the string starts with the given prefix, and False otherwise
Tests = [
# String Prefix Expected
['a', 'a', True],
['a', 'b', False],
['abc', 'a', True],
['abc', 'ab', True],
['abc', 'abc', True],
['abc', 'abcd', False],
['abc', '', True]
]
passes = 0
failures = 0
for (s, p, expected) in Tests:
    actual = s.startswith(p)
    if actual == expected:
        passes += 1
    else:
        failures += 1
print 'passed', passes, 'out of', passes+failures, 'tests'
if/else
try block
except block
try block, Python raises an exception
except
else block
try block
for num in [-1, 0, 1]:
    try:
        inverse = 1/num
    except:
        print 'inverting', num, 'caused error'
    else:
        print 'inverse of', num, 'is', inverse
inverse of -1 is -1
inverting 0 caused error
inverse of 1 is 1
![[Flow of Control in Try/Except/Else]](./img/qa/try_except_else.png)
Figure 11.1: Flow of Control in Try/Except/Else
except statement
# Note: mix of numeric and non-numeric values.
values = [0, 1, 'momentum']
# Note: top index will be out of bounds.
for i in range(4):
    try:
        print 'dividing by value', i
        x = 1.0 / values[i]
        print 'result is', x
    except ZeroDivisionError, e:
        print 'divide by zero:', e
    except IndexError, e:
        print 'index error:', e
    except:
        # Careful: a bare except does not bind e, so this prints the
        # value left over from an earlier handler.
        print 'some other error:', e
dividing by value 0
divide by zero: float division
dividing by value 1
result is 1.0
dividing by value 2
some other error: float division
dividing by value 3
index error: list index out of range
except blocks are tested in order: whichever matches first, wins
except appears, it must come last (since it catches everything)
except Exception, e so that you have the exception object
ZeroDivisionError, OverflowError, and FloatingPointError are all types of ArithmeticError
| Name | Purpose | ||
|---|---|---|---|
| Exception | Root of exception hierarchy | ||
| ArithmeticError | Illegal arithmetic operation | ||
| FloatingPointError | Generic error in floating point calculation | ||
| OverflowError | Result too large to represent | ||
| ZeroDivisionError | Attempt to divide by zero | ||
| IndexError | Bad index to sequence (out of bounds or illegal type) | ||
| TypeError | Illegal type (e.g., trying to add integer and string) | ||
| ValueError | Illegal value (e.g., math.sqrt(-1)) | ||
| EnvironmentError | Error interacting with the outside world | ||
| IOError | Unable to create or open file, read data, etc. | ||
| OSError | No permissions, no such device, etc. | ||
| Table 11.1: Common Exception Types in Python | |||
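Because the exceptions in Table 11.1 form a hierarchy, a handler for a parent type also catches its children. The sketch below illustrates this with a hypothetical `classify` helper. It uses the `except X as e` spelling so that it runs on Python 3, where `EnvironmentError` and `IOError` are kept as aliases of `OSError`; the listings in these notes use the older `except X, e` form.

```python
def classify(action):
    '''Run action() and report which handler caught the exception, if any.'''
    try:
        action()
    except ArithmeticError as e:       # also catches ZeroDivisionError
        return 'arithmetic: ' + type(e).__name__
    except EnvironmentError as e:      # also catches file-related errors
        return 'environment: ' + type(e).__name__
    except Exception as e:             # everything else, so it comes last
        return 'other: ' + type(e).__name__
    return 'no error'

print(classify(lambda: 1 / 0))
print(classify(lambda: [1, 2, 3][5]))
print(classify(lambda: open('/no/such/directory/file.txt')))
print(classify(lambda: 1 + 1))
```

Putting the most general handler last matters: if `except Exception` came first, the more specific handlers would never run.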
try/except block, it pushes the except handlers on a stack
![[Stacking Exception Handlers]](./img/qa/exception_stack.png)
Figure 11.2: Stacking Exception Handlers
def invert(vals, index):
try:
vals[index] = 10.0/vals[index]
except ArithmeticError, e:
print 'inner exception handler:', e
def each(vals, indices):
try:
for i in indices:
invert(vals, i)
except IndexError, e:
print 'outer exception handler:', e
# Once again, the top index will be out of bounds.
values = [-1, 0, 1]
print 'values before:', values
each(values, range(4))
print 'values after:', values
values before: [-1, 0, 1]
inner exception handler: float division
outer exception handler: list index out of range
values after: [-10.0, 0, 10.0]
raise to trigger exception processing
raise Exception('this is an error message')
for i in range(4):
    try:
        if (i % 2) == 1:
            raise ValueError('index is odd')
        else:
            print 'not raising exception for %d' % i
    except ValueError, e:
        print 'caught exception for %d' % i, e
not raising exception for 0
caught exception for 1 index is odd
not raising exception for 2
caught exception for 3 index is odd
None, -1, False, or some other value
string.find breaks this rule
stderr
try/except
Tests = [
['a', 'a', False], # wrong expected value
['a', 1, False], # wrong type
['abc', 'a', True] # everything legal
]
passes = failures = errors = 0
for (s, p, expected) in Tests:
try:
actual = s.startswith(p)
if actual == expected:
passes += 1
else:
failures += 1
except:
errors += 1
print 'tests:', passes + failures + errors
print 'passes:', passes
print 'failures:', failures
print 'errors:', errors
tests: 3
passes: 1
failures: 1
errors: 1
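The pass/fail/error bookkeeping above can be wrapped in a function so that it is written only once. The sketch below is one possible shape, not the lecture's own harness: `run_tests` and `reciprocal` are hypothetical names, and it adds the assumption that an expected value may be an exception class, meaning "this call ought to raise that exception". It uses Python 3 syntax.

```python
def run_tests(func, tests):
    '''Return (passes, failures, errors) for a list of [input, expected] pairs.'''
    passes = failures = errors = 0
    for (arg, expected) in tests:
        expects_exception = (isinstance(expected, type)
                             and issubclass(expected, Exception))
        try:
            actual = func(arg)
        except Exception as e:
            if expects_exception and isinstance(e, expected):
                passes += 1        # raised exactly what the test asked for
            else:
                errors += 1        # something unexpected went wrong
        else:
            if (not expects_exception) and actual == expected:
                passes += 1
            else:
                failures += 1      # ran to completion, but wrong result
    return passes, failures, errors

def reciprocal(x):
    return 1.0 / x

results = run_tests(reciprocal, [
    [2, 0.5],                  # pass
    [4, 0.5],                  # failure: wrong expected value
    [0, ZeroDivisionError],    # pass: the exception is the expected result
    ['a', 0.5],                # error: unexpected TypeError
])
print(results)
```

Separating failures from errors keeps "the code gave the wrong answer" distinct from "the test itself could not run".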
Tests = [
[[], [], 'empty list'],
[[1], [1], 'single value'],
[[1, 3], [1, 4], 'two values'],
[[1, 3, 7], [1, 4, 11], 'three values'],
[[-1, 1], [-1, 0], 'negative values'],
[[1, 3.0], [1, 4.0], 'mixed types'],
["string", ValueError, 'non-list input'],
[['a'], ValueError, 'non-numeric value']
]
AssertionError exception
def find_range(values):
    '''Find the non-empty range of values in the input sequence.'''
    assert (type(values) is list) and (len(values) > 0)
    left = min(values)
    right = max(values)
    assert (left in values) and (right in values) and (left <= right)
    return left, right
left is less than or equal to all other values, or that right is greater than or equal to
assert liberally
def can_transmute(element):
'''Can this element be turned into gold?'''
# Bug #172: make sure the input is actually an element.
assert is_valid_element(element)
# Gold is trivial.
if element is Gold:
return True
# Trans-uranic metals and halogens are impossible.
if (element.atomic_number > Uranium.atomic_number) or \
(element in Halogens):
return False
# Look for a sequence of steps that leads to gold.
steps = search_transmutations(element, Gold)
if steps == []:
return False
else:
# Bug #201: must be at least two elements in sequence.
assert len(steps) >= 2
return True
set type is built in to Python 2.4 and higher
set()
vowels = set()
for char in 'aieoeiaoaaeieou':
    vowels.add(char)
print vowels
set(['a', 'i', 'e', 'u', 'o'])
| Method | Purpose | Example | Result | Alternative Form |
|---|---|---|---|---|
| Example values: | ten = set(range(10)) | lows = set([0, 1, 2, 3, 4]) | odds = set([1, 3, 5, 7, 9]) | |
| add | Add an element to a set | lows.add(9) | None | lows is now set([0, 1, 2, 3, 4, 9]) |
| clear | Remove all elements from the set | lows.clear() | None | lows is now set() |
| difference | Create a set with elements that are in one set, but not the other | lows.difference(odds) | set([0, 2, 4]) | lows - odds |
| intersection | Create a set with elements that are in both arguments | lows.intersection(odds) | set([1, 3]) | lows & odds |
| issubset | Are all of one set's elements contained in another? | lows.issubset(ten) | True | lows <= ten |
| issuperset | Does one set contain all of another's elements? | lows.issuperset(odds) | False | lows >= odds |
| remove | Remove an element from a set | lows.remove(0) | None | lows is now set([1, 2, 3, 4]) |
| symmetric_difference | Create a set with elements that are in exactly one set | lows.symmetric_difference(odds) | set([0, 2, 4, 5, 7, 9]) | lows ^ odds |
| union | Create a set with elements that are in either argument | lows.union(odds) | set([0, 1, 2, 3, 4, 5, 7, 9]) | lows \| odds |
| Table 12.1: Set Methods and Operators | ||||
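The method and operator forms in Table 12.1 are interchangeable for sets. The following sketch checks each pair using the table's example values; it runs unchanged on Python 3.

```python
ten = set(range(10))
lows = set([0, 1, 2, 3, 4])
odds = set([1, 3, 5, 7, 9])

# Each entry compares a method call with its operator spelling.
checks = [
    lows.difference(odds) == lows - odds,
    lows.intersection(odds) == lows & odds,
    lows.symmetric_difference(odds) == lows ^ odds,
    lows.union(odds) == lows | odds,
    lows.issubset(ten) == (lows <= ten),
    lows.issuperset(odds) == (lows >= odds),
]
print(all(checks))
print(sorted(lows - odds), sorted(lows & odds))
```

One practical difference: the methods accept any iterable as their argument (for example `lows.union([5, 6])`), while the operators require both sides to be sets.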
lines = [
'canada goose', 'canada goose', 'long-tailed jaeger', 'canada goose',
'snow goose', 'canada goose', 'canada goose', 'northern fulmar'
]
seen = set()
for line in lines:
seen.add(line.strip())
for bird in seen:
print bird
northern fulmar
snow goose
long-tailed jaeger
canada goose
for loops over the values in the set
![[Hashing]](./img/py04/hashing.png)
Figure 12.1: Hashing
![[Misplaced Values]](./img/py04/misplaced_values.png)
Figure 12.2: Misplaced Values
values = set()
values.add('birds')
print values
values.add(('Canada', 'goose'))
print values
values.add(['snow', 'goose'])
print values
Traceback (most recent call last):
File "mutable_in_set.py", line 8, in ?
values.add(['snow', 'goose'])
File "/usr/lib/python2.3/sets.py", line 521, in add
self._data[element] = True
TypeError: list objects are unhashable
("snow", "goose")
$ python
>>> birds = set()
>>> arctic = frozenset(['goose', 'tern'])
>>> birds.add(arctic)
>>> print birds
set([frozenset(['goose', 'tern'])])
>>> arctic.add('eider')
AttributeError: 'frozenset' object has no attribute 'add'
if name in seen check requires N/2 comparisons on average
![[List vs. Set Performance]](./img/py04/list_vs_set.png)
Figure 12.3: List vs. Set Performance
![[Binary Search]](./img/py04/binary_search.png)
Figure 12.4: Binary Search
![[List vs. Set Performance Revisited]](./img/py04/logarithmic.png)
Figure 12.5: List vs. Set Performance Revisited
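For comparison with the figures above, here is a sketch of binary search over a sorted list using the standard `bisect` module. It is a generic illustration rather than the code used to produce the figures, and `contains_sorted` is a hypothetical helper name.

```python
import bisect

def contains_sorted(sorted_names, name):
    '''Binary search: True if name is in the sorted list sorted_names.'''
    # bisect_left finds where name would be inserted to keep the list
    # sorted, halving the search range at each step.
    i = bisect.bisect_left(sorted_names, name)
    return i < len(sorted_names) and sorted_names[i] == name

birds = sorted(['tern', 'goose', 'hawk', 'jaeger', 'fulmar'])
print(contains_sorted(birds, 'hawk'), contains_sorted(birds, 'eider'))
```

Binary search needs about log2(N) comparisons instead of N/2, but it only works if the list is kept sorted, which a set does not require.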
(name, count) in set…
![[Dictionaries as Tables]](./img/py04/dict_as_table.png)
Figure 12.6: Dictionaries as Tables
{}
{'Newton':1642, 'Darwin':1809}
{}
[]
birthday = {
'Newton' : 1642,
'Darwin' : 1809
}
print "Darwin's birthday:", birthday['Darwin']
print "Newton's birthday:", birthday['Newton']
Darwin's birthday: 1809
Newton's birthday: 1642
birthday = {
'Newton' : 1642,
'Darwin' : 1809
}
print birthday['Turing']
Traceback (most recent call last):
File "key_error.py", line 5, in ?
print birthday['Turing']
KeyError: 'Turing'
birthday = {}
birthday['Darwin'] = 1809
birthday['Newton'] = 1942 # oops
birthday['Newton'] = 1642
print birthday
{'Darwin': 1809, 'Newton': 1642}
del d[k]
birthday = {
'Newton' : 1642,
'Darwin' : 1809,
'Turing' : 1912
}
print 'Before deleting Turing:', birthday
del birthday['Turing']
print 'After deleting Turing:', birthday
del birthday['Faraday']
print 'After deleting Faraday:', birthday
Before deleting Turing: {'Turing': 1912, 'Newton': 1642, 'Darwin': 1809}
After deleting Turing: {'Newton': 1642, 'Darwin': 1809}
Traceback (most recent call last):
File "dict_del.py", line 10, in ?
del birthday['Faraday']
KeyError: 'Faraday'
k is in a dictionary d using k in d
birthday = {
'Newton' : 1642,
'Darwin' : 1809
}
for name in ['Newton', 'Turing']:
if name in birthday:
print name, birthday[name]
else:
print 'Who is', name, '?'
Newton 1642
Who is Turing ?
for k in d loops over the dictionary's keys (rather than its values)
for loops over the values, rather than indices
birthday = {
'Newton' : 1642,
'Darwin' : 1809,
'Turing' : 1912
}
for name in birthday:
print name, birthday[name]
Turing 1912
Newton 1642
Darwin 1809
| Method | Purpose | Example | Result |
|---|---|---|---|
| clear | Empty the dictionary. | d.clear() | Returns None, but d is now empty. |
| get | Return the value associated with a key, or a default value if the key is not present. | d.get('x', 99) | Returns d['x'] if "x" is in d, or 99 if it is not. |
| keys | Return the dictionary's keys as a list. Entries are guaranteed to be unique. | birthday.keys() | ['Turing', 'Newton', 'Darwin'] |
| items | Return a list of (key, value) pairs. | birthday.items() | [('Turing', 1912), ('Newton', 1642), ('Darwin', 1809)] |
| values | Return the dictionary's values as a list. Entries may or may not be unique. | birthday.values() | [1912, 1642, 1809] |
| update | Copy keys and values from one dictionary into another. | See the example below. | |
| Table 12.2: Dictionary Methods in Python | |||
birthday = {
'Newton' : 1642,
'Darwin' : 1809,
'Turing' : 1912
}
print 'keys:', birthday.keys()
print 'values:', birthday.values()
print 'items:', birthday.items()
print 'get:', birthday.get('Curie', 1867)
temp = {
'Curie' : 1867,
'Hopper' : 1906,
'Franklin' : 1920
}
birthday.update(temp)
print 'after update:', birthday
birthday.clear()
print 'after clear:', birthday
keys: ['Turing', 'Newton', 'Darwin']
values: [1912, 1642, 1809]
items: [('Turing', 1912), ('Newton', 1642), ('Darwin', 1809)]
get: 1867
after update: {'Curie': 1867, 'Darwin': 1809, 'Franklin': 1920, 'Turing': 1912, 'Newton': 1642, 'Hopper': 1906}
after clear: {}
# Data to count.
names = ['tern','goose','goose','hawk','tern','goose', 'tern']
# Build a dictionary of frequencies.
freq = {}
for name in names:
# Already seen, so increment count by one.
if name in freq:
freq[name] = freq[name] + 1
# Never seen before, so add to dictionary.
else:
freq[name] = 1
# Display.
print freq
{'goose': 3, 'tern': 3, 'hawk': 1}
dict.get
freq = {}
for name in names:
    freq[name] = freq.get(name, 0) + 1
print freq
{'goose': 3, 'tern': 3, 'hawk': 1}
keys = freq.keys()
keys.sort()
for k in keys:
print k, freq[k]
goose 3
hawk 1
tern 3
{'a':1, 'b':1, 'c':1}?
dict.get(key, []) instead of dict.get(key, 0)
inverse = {}
for (key, value) in freq.items():
seen = inverse.get(value, [])
seen.append(key)
inverse[value] = seen
keys = inverse.keys()
keys.sort()
for k in keys:
print k, inverse[k]
1 ['hawk']
3 ['goose', 'tern']
![[Inverting a Dictionary]](./img/py04/invert_dict.png)
Figure 12.7: Inverting a Dictionary
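A third way to write the inversion, offered here as an alternative rather than as the lecture's method, is `dict.setdefault`, which returns the existing list for a key or stores and returns a new one. The sketch uses the frequency counts from the earlier example and Python 3 syntax.

```python
freq = {'goose': 3, 'tern': 3, 'hawk': 1}

inverse = {}
for (key, value) in freq.items():
    # setdefault(value, []) returns inverse[value], creating an empty
    # list first if value has not been seen before.
    inverse.setdefault(value, []).append(key)

for count in sorted(inverse):
    print(count, sorted(inverse[count]))
```

Because several birds can share a count, each value in the inverted dictionary has to be a list rather than a single name.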
inverse = {}
for (key, value) in freq.items():
    if value not in inverse:
        inverse[value] = []
    inverse[value].append(key)
"%" can take a dictionary as its right argument
"%(varname)s" inside the format string to identify what's to be substituted
birthday = {
'Newton' : 1642,
'Darwin' : 1809,
'Turing' : 1912
}
entry = '%(name)s: %(year)s'
for (name, year) in birthday.items():
temp = {'name' : name, 'year' : year}
print entry % temp
Turing: 1912
Newton: 1642
Darwin: 1809
def settings(title, **kwargs):
print 'title:', title
for key in kwargs:
print ' %s: %s' % (key, kwargs[key])
settings('nothing extra')
settings('colors', red=0.0, green=0.5, blue=1.0)
title: nothing extra
title: colors
blue: 1.0
green: 0.5
red: 0.0
** in front of kwargs means “Put any extra keyword arguments in a dictionary, and assign it to kwargs”
def sum(*values):
result = 0.0
for v in values:
result += v
return result
print "no values:", sum()
print "single value:", sum(3)
print "five values:", sum(3, 4, 5, 6, 7)
no values: 0.0
single value: 3.0
five values: 25.0
* in front of values means “Put any extra unnamed arguments in a tuple, and assign it to values”
* argument per function
** mean? How and why would you use it?
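The * and ** forms can be combined in one function definition, and the same symbols can be used when calling a function to unpack a sequence or dictionary into arguments. A minimal sketch, with `describe` as a hypothetical function (Python 3 syntax):

```python
def describe(title, *values, **options):
    '''Return the arguments exactly as Python packaged them.'''
    return title, values, options

# Extra positional arguments land in a tuple, extra keywords in a dict.
packed = describe('sample', 1, 2, 3, units='cm', scale=2.0)

# At the call site, * and ** do the reverse: they spread a sequence
# and a dictionary out into separate arguments.
numbers = [4, 5]
settings = {'units': 'mm'}
unpacked = describe('unpacked', *numbers, **settings)

print(packed)
print(unpacked)
```

This is what lets a wrapper function pass along whatever it was given without knowing the details of the function it wraps.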
WingIDE in this lecture
![[A Debugger in Action]](./img/debugging/debugger_in_action.png)
Figure 13.1: A Debugger in Action
![[Source Browser]](./img/debugging/source_browser.png)
Figure 13.2: Source Browser
![[Code Assistant]](./img/debugging/code_assistant.png)
Figure 13.3: Code Assistant
Microsoft Visual Studio on Windows
Eclipse
GDB
pdb
pdb.set_trace() inside a program
import pdb
base = "Na"
pdb.set_trace()
acid = "Cl"
salt = base + acid
print salt
$ python lec/inc/debugging/set_trace.py
> /swc/lec/inc/debugging/set_trace.py(7)?()
-> acid = "Cl"
(Pdb) n
> /swc/lec/inc/debugging/set_trace.py(8)?()
-> salt = base + acid
(Pdb) n
> /swc/lec/inc/debugging/set_trace.py(9)?()
-> print salt
(Pdb) n
NaCl
--Return--
![[Inspecting Values]](./img/debugging/inspecting_values.png)
Figure 13.4: Inspecting Values
2*x<0, debugger displays False
![[Programs As Data]](./img/debugging/programs_as_data.png)
Figure 13.5: Programs As Data
HALT instruction
![[Creating a Breakpoint]](./img/debugging/setting_breakpoint.png)
Figure 13.6: Creating a Breakpoint
HALT instruction, it signals the debugger
HALT once again
max_temp to -1)
time_spent_waiting to 600 seconds in debugger than to pull out the network cable and wait…
None
DEBUG: only want to see it when debugging a problem
INFO: information about normal operations
WARNING: something that a human being should pay attention to
ERROR: something has gone wrong inside the software
CRITICAL: something has gone very wrong inside the software
WARNING-level messages and above in a file
import logging
logging.basicConfig(level=logging.WARNING,
format='%(asctime)s %(levelname)s %(message)s',
datefmt='%Y-%b-%d %H:%M:%S',
filename='logging_example.out',
filemode='w')
logging.debug('Last file opened: %s', datafile)
logging.info('User %s logged in normally on %s', user_id, machine_name)
logging.warning('%s attempted to log in as %s', villain, user_id)
logging.error('No such spell (spell ID %04d)', spell_id)
logging.critical('Failed to cast %s', curse)
2006-Feb-02 16:19:02 WARNING dmalfoy attempted to log in as hpotter
2006-Feb-02 16:19:02 ERROR No such spell (spell ID 0172)
2006-Feb-02 16:19:02 CRITICAL Failed to cast Confusius
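The level filtering shown above can be tried without creating a file by pointing a handler at an in-memory stream. This sketch is for Python 3; the logger name `logging_sketch` is arbitrary, and the timestamp is left out of the format so that the captured text is predictable.

```python
import io
import logging

stream = io.StringIO()                       # in-memory stand-in for a log file
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))

logger = logging.getLogger('logging_sketch')  # arbitrary name for this sketch
logger.setLevel(logging.WARNING)              # drop DEBUG and INFO messages
logger.propagate = False                      # keep output out of the root logger
logger.addHandler(handler)

logger.debug('only wanted when debugging')
logger.info('normal operation')
logger.warning('%s attempted to log in as %s', 'dmalfoy', 'hpotter')
logger.error('No such spell (spell ID %04d)', 172)

captured = stream.getvalue().splitlines()
for line in captured:
    print(line)
```

Passing the values as separate arguments, rather than formatting the string first, means the formatting work is skipped for messages that are filtered out.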
assert to check things that ought to be right
for time in simulation_period:
    for thing in world:
        if type(thing) is plant:
            update_plant(thing, time)
        elif type(thing) is fish:
            update_fish(thing, time)
        elif type(thing) is creepy_crawly:
            update_creepy_crawly(thing, time)
        # ...
for time in simulation_period:
    for thing in world:
        thing.update(time)
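The second loop works only if every kind of thing in the world supplies its own `update` method. A minimal sketch of what that might look like, with hypothetical `Plant` and `Fish` classes whose behaviour is invented purely for illustration (Python 3 syntax):

```python
class Plant(object):
    def __init__(self):
        self.height = 0
    def update(self, time):
        self.height += 1          # invented rule: plants grow one unit per step

class Fish(object):
    def __init__(self):
        self.position = 0
    def update(self, time):
        self.position += 2        # invented rule: fish swim two units per step

world = [Plant(), Fish()]
for time in range(3):
    for thing in world:
        thing.update(time)        # each object picks its own behaviour

print(world[0].height, world[1].position)
```

Adding a new kind of creature now means writing one new class; the simulation loop itself does not change.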
![[Memory Model for Classes and Objects]](./img/oop01/classes_and_objects.png)
Figure 14.1: Memory Model for Classes and Objects
class keyword
object in parentheses
":" and an indented block containing the class's contents
class Empty(object):
    pass
pass means “do nothing”, i.e., create an empty class
if __name__ == '__main__':
    first = Empty()
    second = Empty()
    print 'first has id', id(first)
    print 'second has id', id(second)
first has id 5086860
second has id 5086892
id returns the object's identity (an integer unique to that object while it exists), not its hash code
if __name__ == '__main__':
self
this in C++ and Java, the name is just a convention
object.method(argument) is equivalent to:
C that object is an instance of
C.method(object, argument)
class Greeting(object):
    def say(self, name):
        print 'Hello, %s!' % name
if __name__ == '__main__':
    greet = Greeting()
    greet.say('object')
Hello, object!
self.x = 3
x with the value 3
x with the value 3
import math
class Point(object):
    def set_values(self, x, y):
        self.x = x
        self.y = y
    def get_values(self):
        return (self.x, self.y)
    def norm(self):
        return math.sqrt(self.x ** 2 + self.y ** 2)
if __name__ == '__main__':
    p = Point()
    p.set_values(1.2, 3.5)
    print 'p is', p.get_values()
    print 'norm is', p.norm()
p is (1.2, 3.5)
norm is 3.7
![[Creating a Simple Point]](./img/oop01/simple_point.png)
Figure 14.2: Creating a Simple Point
p = Point()
p.x = 3.5
p.y = 4.25
print 'point is', p.get_values()
point is (3.5, 4.25)
__init__, Python will call it when building new instances
import math
class Point(object):
    def __init__(self, x=0, y=0):
        self.reset(x, y)
    def reset(self, x, y):
        assert (type(x) is int) and (x >= 0), 'x is not non-negative integer'
        assert (type(y) is int) and (y >= 0), 'y is not non-negative integer'
        self.x = x
        self.y = y
    def get(self):
        return (self.x, self.y)
    def norm(self):
        return math.sqrt(self.x ** 2 + self.y ** 2)
if __name__ == '__main__':
    p = Point(1, 1)
    print 'point is initially', p.get()
    p.reset(1, 1)
    print 'p moved to', p.get()
point is initially (1, 1)
p moved to (1, 1)
__init__ is just one example of a special method
__str__
__str__ if it exists, or
class Point(object):
    ...
    def __str__(self):
        return '(%4.2f, %4.2f)' % (self.x, self.y)
if __name__ == '__main__':
    p = Point(3, 4)
    print 'point is', p
point is (3.00, 4.00)
Organism that represents living things
Mammal
Organism's definition and add more members and methods
class Organism(object):
def __init__(self, common_name, sci_name):
self.common_name = common_name
self.sci_name = sci_name
def get_common_name(self):
return self.common_name
def get_sci_name(self):
return self.sci_name
def __str__(self):
return '%s (%s)' % (self.common_name, self.sci_name)
class Mammal(Organism):
def __init__(self, common_name, sci_name, body_temp, gest_period):
Organism.__init__(self, common_name, sci_name)
self.body_temp = body_temp
self.gest_period = gest_period
def get_body_temp(self):
return self.body_temp
def get_gest_period(self):
return self.gest_period
def __str__(self):
extra = ' %4.2f degrees / %d days' % (self.body_temp, self.gest_period)
return Organism.__str__(self) + extra
if __name__ == '__main__':
creature = Mammal('wolf', 'canis lupus', 38.7, 63)
print creature
wolf (canis lupus) 38.70 degrees / 63 days
![[Memory Model for Inheritance]](./img/oop01/inheritance.png)
Figure 14.3: Memory Model for Inheritance
Mammal's constructor calls Organism's to initialize the organism-ish bits of the object
Mammal defines its own __str__ method
Organism
Mammal.__str__ calls Organism.__str__ for the same reason that Mammal.__init__ calls Organism.__init__
Bird from Organism
class Bird(Organism):
def __init__(self, common_name, sci_name, incubate_period):
Organism.__init__(self, common_name, sci_name)
self.incubate_period = incubate_period
def get_incubate_period(self):
return self.incubate_period
def __str__(self):
extra = ' %d days' % self.incubate_period
return Organism.__str__(self) + extra
if __name__ == '__main__':
creatures = [
Bird('loon', 'gavia immer', 27),
Mammal('grizzly bear', 'ursus arctos horribilis', 38.0, 210)
]
for c in creatures:
print c
loon (gavia immer) 27 days
grizzly bear (ursus arctos horribilis) 38.00 degrees / 210 days
class Mineral(object):
    def __init__(self, common_name, sci_name, formula):
        self.common_name = common_name
        self.sci_name = sci_name
        self.formula = formula
    def get_common_name(self):
        return self.common_name
    def get_sci_name(self):
        return self.sci_name
    def __str__(self):
        return '%s/%s: %s' % (self.common_name, self.sci_name, self.formula)

if __name__ == '__main__':
    things = [
        Mammal('arctic hare', 'Lepus arcticus', 40.1, 50),
        Mineral("fool's gold", 'iron pyrite', 'FeS2')
    ]
    for t in things:
        print t.get_common_name(), 'is', t.get_sci_name()

arctic hare is Lepus arcticus
fool's gold is iron pyrite
- Child.meth may ignore some of Parent.meth's pre-conditions, but may not impose more
  - Child.meth accepts everything that Parent.meth did, and possibly more
  - Any code that calls Parent.meth correctly is guaranteed to call Child.meth correctly too
- Child.meth must satisfy all the post-conditions of Parent.meth, and may impose more
  - Child.meth's possible output is a subset of Parent.meth's
  - Code that uses the result of Parent.meth will still work if given an instance of Child instead
- Example: derive Plant and Animal from Organism
- Option 1: give Organism two methods: can_move and move
  - Plant.can_move() returns False
  - Plant.move() raises an exception
- Option 2: give Organism one method: move
  - Plant.move() does nothing
  - But having Plant.move implies that plants can do something they can't

![[CRC Cards]](./img/oop01/crc.png)
Figure 14.4: CRC Cards
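The substitution rules above (weaker pre-conditions, stronger post-conditions) can be made concrete with a small sketch. The specific contracts chosen here (positive integers for the parent, any non-negative number for the child) are assumptions made purely for illustration; only the names Parent, Child, and meth come from the notes.

```python
class Parent(object):
    def meth(self, x):
        # Pre-condition: x must be a positive integer.
        assert isinstance(x, int) and x > 0, 'x must be a positive int'
        # Post-condition: the result is a non-negative number.
        return x * 2


class Child(Parent):
    def meth(self, x):
        # Weaker pre-condition: zero and floats are accepted as well.
        assert x >= 0, 'x must be non-negative'
        # Stronger post-condition: the result is always a non-negative int.
        return int(x) * 2


def caller(obj):
    '''Code written against Parent's contract only.'''
    result = obj.meth(3)
    assert result >= 0
    return result
```

Because Child accepts everything Parent accepts and returns a subset of what Parent may return, caller works unchanged with either class.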
- __init__ and __str__ are special methods
- If obj has a __len__ method, Python calls it whenever it sees len(obj)

class Recent(object):
    def __init__(self, number=3):
        self.number = number
        self.items = []
    def __str__(self):
        return str(self.items)
    def add(self, item):
        self.items.append(item)
        self.items = self.items[-self.number:]
    def __len__(self):
        return len(self.items)

if __name__ == '__main__':
    history = Recent()
    for era in ['Permian', 'Triassic', 'Jurassic', 'Cretaceous', 'Tertiary']:
        history.add(era)
        print len(history), history

1 ['Permian']
2 ['Permian', 'Triassic']
3 ['Permian', 'Triassic', 'Jurassic']
3 ['Triassic', 'Jurassic', 'Cretaceous']
3 ['Jurassic', 'Cretaceous', 'Tertiary']

- "a + b" is “just” a shorthand for add(a, b)
- Or, if a is an object, for a.add(b)
- Instead of add, Python spells this method __add__
- If a class defines __add__, it is called whenever something is +'d to the object
  - x + y calls x.__add__(y)

class Recent(object):
    # __init__, __str__, and __len__ as before
    def __add__(self, item):
        self.items.append(item)
        self.items = self.items[-self.number:]
        return self

if __name__ == '__main__':
    history = Recent()
    for era in ['Permian', 'Triassic', 'Jurassic', 'Cretaceous', 'Tertiary']:
        history = history + era
        print len(history), history

1 ['Permian']
2 ['Permian', 'Triassic']
3 ['Permian', 'Triassic', 'Jurassic']
3 ['Triassic', 'Jurassic', 'Cretaceous']
3 ['Jurassic', 'Cretaceous', 'Tertiary']
- 2 + x and x + 2 don't always do the same thing
- When the object is on the right-hand side, Python uses __radd__ instead of __add__
  - If the left-hand object has an __add__ method, call that
  - Otherwise, if the right-hand object has an __radd__ method, call that

| Method | Purpose |
|---|---|
| __lt__(self, other) | Less than comparison; __le__, __ne__, and others are used for less than or equal, not equal, etc. |
| __call__(self, args…) | Called for obj(3, "lithium") |
| __len__(self) | Object “length” |
| __getitem__(self, key) | Called for obj[3.14] |
| __setitem__(self, key, value) | Called for obj[3.14] = 2.17 |
| __contains__ | Called for "lithium" in obj |
| __add__ | Called for obj + value; use __mul__ for obj * value, etc. |
| __int__ | Called for int(obj); use __float__ and others to convert to other types |
| Table 15.1: Special Methods | |
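As a minimal sketch of how __add__ and __radd__ cooperate, consider the class below. Metres is a hypothetical type invented for this illustration; it does not appear elsewhere in these notes.

```python
class Metres(object):
    '''Hypothetical length type, used only to illustrate __radd__.'''
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Called for Metres + something.
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        return Metres(self.value + other)

    def __radd__(self, other):
        # Called for something + Metres when 'something' cannot handle it.
        return self.__add__(other)


left = Metres(3) + 2     # handled by Metres.__add__
right = 2 + Metres(3)    # int cannot add a Metres, so Metres.__radd__ is tried
```

Without __radd__, the second expression would raise a TypeError, because integers know nothing about Metres.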
- What is the length of v after the following operations?

v = SparseVector()  # all values initialized to 0.0
v[27] = 1.0         # length is now 28
v[43] = 1.0         # length is now 44
v[43] = 0.0         # is the length still 44, or 28?

- Implement __len__, __getitem__, and __setitem__ to make it behave like a list
- del sparse[index] would call __delitem__, which this class does not define

class SparseVector(object):
    '''Implement a sparse vector.  If a value has not been set
    explicitly, its value is zero.'''

    def __init__(self):
        '''Construct a sparse vector with all zero entries.'''
        self.data = {}

    def __len__(self):
        '''The length of a vector is one more than the largest index.'''
        if self.data:
            return 1 + max(self.data.keys())
        return 0

    def __getitem__(self, key):
        '''Return an explicit value, or 0.0 if none has been set.'''
        if key in self.data:
            return self.data[key]
        return 0.0

    def __setitem__(self, key, value):
        '''Assign a new value to a vector entry.'''
        if type(key) is not int:
            raise KeyError, 'non-integer index to sparse vector'
        self.data[key] = value

- The right-hand operand of a binary operator (such as "*") is usually called other
- Dot product is commutative, so __rmul__ can do the same thing as __mul__ (writing __rmul__ = __mul__ would also work)

    def __mul__(self, other):
        '''Calculate dot product of a sparse vector with something else.'''
        result = 0.0
        for k in self.data:
            result += self.data[k] * other[k]
        return result

    def __rmul__(self, other):
        return self.__mul__(other)

    def __add__(self, other):
        '''Add something to a sparse vector.'''
        # Initialize result with all non-zero values from this vector.
        result = SparseVector()
        result.data.update(self.data)
        # If the other object is also a sparse vector, add non-zero values.
        if isinstance(other, SparseVector):
            for k in other.data:
                result[k] = result[k] + other[k]
        # Otherwise, use brute force.
        else:
            for i in range(len(other)):
                result[i] = result[i] + other[i]
        return result

    # Right-hand add does the same thing as left-hand add.
    __radd__ = __add__

- Better: replace the print statements below with assertions

if __name__ == '__main__':
    x = SparseVector()
    x[1] = 1.0
    x[3] = 3.0
    x[5] = 5.0
    print 'len(x)', len(x)
    for i in range(len(x)):
        print '...', i, x[i]
    y = SparseVector()
    y[1] = 10.0
    y[2] = 20.0
    y[3] = 30.0
    print 'x + y', x + y
    print 'y + x', y + x
    print 'x * y', x * y
    print 'y * x', y * x
    z = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
    print 'x + z', x + z
    print 'x * z', x * z
    print 'z + x', z + x

len(x) 6
... 0 0.0
... 1 1.0
... 2 0.0
... 3 3.0
... 4 0.0
... 5 5.0
x + y [0.0, 11.0, 20.0, 33.0, 0.0, 5.0]
y + x [0.0, 11.0, 20.0, 33.0, 0.0, 5.0]
x * y 100.0
y * x 100.0
x + z [0.0, 1.1, 0.2, 3.3, 0.4, 5.5]
x * z 3.5
z + x [0.0, 1.1, 0.2, 3.3, 0.4, 5.5]
- Variables defined directly inside a class block belong to the class as a whole

class Counter(object):
    num = 0  # Number of Counter objects created.
    def __init__(self, name):
        Counter.num += 1
        self.name = name

if __name__ == '__main__':
    print 'initial count', Counter.num
    first = Counter('first')
    print 'after creating first object', Counter.num
    second = Counter('second')
    print 'after creating second object', Counter.num

initial count 0
after creating first object 1
after creating second object 2
- A static method belongs to the class and has no self parameter
- Define one by putting @staticmethod in front of it

class Experiment(object):
    already_done = {}

    @staticmethod
    def get_results(name, *params):
        if name in Experiment.already_done:
            return Experiment.already_done[name]
        exp = Experiment(name, *params)
        exp.run()
        Experiment.already_done[name] = exp
        return exp

    def __init__(self, name, *params):
        self.name = name
        self.params = params

    def run(self):
        # ...
        pass

if __name__ == '__main__':
    first = Experiment.get_results('anti-gravity')
    second = Experiment.get_results('time travel')
    third = Experiment.get_results('anti-gravity')
    print 'first ', id(first)
    print 'second', id(second)
    print 'third ', id(third)

first  5120204
second 5120396
third  5120204
class AntennaClass(object):
    '''Singleton that controls a radio telescope.'''

    # The unique instance of the class.
    instance = None

    # The constructor fails if an instance already exists.
    def __init__(self, max_rotation):
        assert AntennaClass.instance is None, 'Trying to create a second instance!'
        self.max_rotation = max_rotation
        AntennaClass.instance = self

# Make the creation function look like a class constructor.
def Antenna(max_rotation):
    '''Create and store an AntennaClass instance, or return the one
    that has already been created.'''
    if AntennaClass.instance:
        return AntennaClass.instance
    return AntennaClass(max_rotation)

first = Antenna(23.5)
print 'first instance:', id(first)
second = Antenna(47.25)
print 'second instance:', id(second)

first instance: 10685200
second instance: 10685200
class NestedListVisitor(object):
    '''Visit each element in a list of nested lists.'''

    def __init__(self, data):
        '''Construct, but do not run.'''
        assert type(data) is list, 'Only works on lists!'
        self.data = data

    def run(self):
        '''Iterate over all values.'''
        self.recurse(self.data)

    def recurse(self, current):
        '''Loop over a particular list or sub-list (not meant
        to be called by users).'''
        if type(current) is list:
            for v in current:
                self.recurse(v)
        else:
            self.visit(current)

    def visit(self, value):
        '''Users should fill this method in.'''
        pass

class MaxOfN(NestedListVisitor):
    def __init__(self, data):
        NestedListVisitor.__init__(self, data)
        self.max = None
        self.count = 0

    def visit(self, value):
        self.count += 1
        if self.max is None:
            self.max = value
        else:
            self.max = max(self.max, value)

test_data = [['gold', 'lead'], 'zinc', [['silver', 'iron'], 'mercury']]
test = MaxOfN(test_data)
test.run()
print 'max:', test.max
print 'count:', test.count

max: zinc
count: 6
![[]](./img/oop02/factory_type_family.png)
class AbstractFamily(object):
    '''Builders for particular families derive from this.'''
    def __init__(self, family):
        self.family = family
    def get_name(self):
        return self.family
    def make_controller(self):
        raise NotImplementedError('make_controller missing')
    def make_configuration_panel(self):
        raise NotImplementedError('make_configuration_panel missing')

class FactoryManager(object):
    '''Manage builders by family.'''
    def __init__(self, current_family=None):
        self.builders = {}
        self.family = current_family
    def set_family(self, family):
        assert family, 'Empty family'
        self.family = family
    def add(self, builder):
        name = builder.get_name()
        self.builders[name] = builder
    def make_controller(self):
        self._check_state()
        return self.builders[self.family].make_controller()
    def make_configuration_panel(self):
        self._check_state()
        return self.builders[self.family].make_configuration_panel()
    def _check_state(self):
        assert self.family, 'No family specified'
        assert self.family in self.builders, 'Unknown family: %s' % self.family

factory = FactoryManager()
factory.add(RCT100Factory())
factory.add(Subalta4CFactory())
factory.set_family('Subalta4C')
controller = factory.make_controller()
configuration_panel = factory.make_configuration_panel()

- Command objects have do, undo, and redo methods

class AbstractCommand(object):
    '''Base class for commands.'''
    def is_undoable(self):
        return False  # by default, can't undo/redo operations
    def do(self, robot):
        raise NotImplementedError("Don't know how to do %s" % self.__class__.__name__)
    def undo(self, robot):
        pass
    def redo(self, robot):
        pass

class MoveCommand(AbstractCommand):
    '''Move the robot arm.'''
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
    def is_undoable(self):
        return True
    def do(self, robot):
        robot.translate(self.x, self.y, self.z)
    def undo(self, robot):
        robot.translate(-self.x, -self.y, -self.z)
    def redo(self, robot):
        self.do(robot)

robot = Robot()
commands = [MoveCommand(5.0, 2.0, 2.3),
            RotateCommand(-90.0, 0.0, 0.0),
            MoveCommand(1.0, 2.0, 2.0),
            CloseHandCommand()]
for c in commands:
    c.do(robot)
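One way the do and undo methods might be used together is to keep a history of executed commands. The sketch below is self-contained: its Robot is a stand-in that only tracks a position, and run_all and undo_last are hypothetical helper names, none of which come from the notes.

```python
class Robot(object):
    '''Stand-in robot that only tracks its position (an assumption).'''
    def __init__(self):
        self.pos = [0.0, 0.0, 0.0]
    def translate(self, x, y, z):
        self.pos = [self.pos[0] + x, self.pos[1] + y, self.pos[2] + z]


class MoveCommand(object):
    '''Simplified copy of the MoveCommand shown above.'''
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z
    def is_undoable(self):
        return True
    def do(self, robot):
        robot.translate(self.x, self.y, self.z)
    def undo(self, robot):
        robot.translate(-self.x, -self.y, -self.z)


def run_all(robot, commands):
    '''Execute commands, returning the ones that can later be undone.'''
    history = []
    for c in commands:
        c.do(robot)
        if c.is_undoable():
            history.append(c)
    return history


def undo_last(robot, history):
    '''Undo the most recent undoable command, if there is one.'''
    if history:
        history.pop().undo(robot)


robot = Robot()
history = run_all(robot, [MoveCommand(5.0, 2.0, 2.5), MoveCommand(1.0, 2.0, 2.0)])
undo_last(robot, history)   # robot is back where the first move left it
```

Storing commands as objects is what makes this possible: the history list remembers not just that something happened, but how to reverse it.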
- JUnit is a testing framework originally written by Kent Beck and Erich Gamma in 1997
- Python's equivalent is the unittest module
- Each test is a method (taking only self) of a class derived from unittest.TestCase
- Run the tests by calling unittest.main(), which:
  - Finds the classes derived from unittest.TestCase
  - Runs their test methods and reports the results
- Inside tests, use methods inherited from TestCase rather than bare assert statements:
  - assert_(condition): check that something is true (note the underscore)
  - assertEqual(a, b): check that two things are equal
  - assertNotEqual(a, b): the reverse of the above
  - assertRaises(exception, func, …args…): call func with arguments (if provided), and check that it raises the right exception
  - fail(): signal an unconditional failure

import unittest

class TestAddition(unittest.TestCase):

    def test_zeroes(self):
        self.assertEqual(0 + 0, 0)
        self.assertEqual(5 + 0, 5)
        self.assertEqual(0 + 13.2, 13.2)

    def test_positive(self):
        self.assertEqual(123 + 456, 579)
        self.assertEqual(1.2e20 + 3.4e20, 3.5e20)

    def test_mixed(self):
        self.assertEqual(-19 + 20, 1)
        self.assertEqual(999 + -1, 998)
        self.assertEqual(-300.1 + -400.2, -700.3)

if __name__ == '__main__':
    unittest.main()
.F.
======================================================================
FAIL: test_positive (__main__.TestAddition)
----------------------------------------------------------------------
Traceback (most recent call last):
File "test_addition.py", line 12, in test_positive
self.assertEqual(1.2e20 + 3.4e20, 3.5e20)
AssertionError: 4.6e+20 != 3.5e+20
----------------------------------------------------------------------
Ran 3 tests in 0.000s
FAILED (failures=1)
- Example: test a function running_sum; given [a, b, c, …], it produces [a, a+b, a+b+c, …]
- What should it do if it is given None?

def running_sum(seq):
    result = seq[0:1]
    for i in range(2, len(seq)):
        result.append(result[i-1] + seq[i])
    return result

class SumTests(unittest.TestCase):
    def test_empty(self):
        self.assertEqual(running_sum([]), [])
    def test_single(self):
        self.assertEqual(running_sum([3]), [3])
    def test_double(self):
        self.assertEqual(running_sum([2, 9]), [2, 11])
    def test_long(self):
        self.assertEqual(running_sum([-3, 0, 3, -2, 5]), [-3, -3, 0, -2, 3])
F.E.
======================================================================
ERROR: test_long (__main__.SumTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "running_sum_wrong.py", line 22, in test_long
self.assertEqual(running_sum([-3, 0, 3, -2, 5]), [-3, -3, 0, -2, 3])
File "running_sum_wrong.py", line 7, in running_sum
result.append(result[i-1] + seq[i])
IndexError: list index out of range
======================================================================
FAIL: test_double (__main__.SumTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "running_sum_wrong.py", line 19, in test_double
self.assertEqual(running_sum([2, 9]), [2, 11])
AssertionError: [2] != [2, 11]
----------------------------------------------------------------------
Ran 4 tests in 0.001s
FAILED (failures=1, errors=1)
def running_sum(seq):
    result = seq[0:1]
    for i in range(1, len(seq)):
        result.append(result[i-1] + seq[i])
    return result

....
----------------------------------------------------------------------
Ran 4 tests in 0.000s
OK
- If a test class has a setUp method, unittest calls it before running each test
- If it has a tearDown method, it is run after each test

class TestThiamine(unittest.TestCase):

    def setUp(self):
        self.fixture = Molecule(C=12, H=20, O=1, N=4, S=1)

    def test_erase_nothing(self):
        nothing = Molecule()
        self.fixture.erase(nothing)
        self.assertEqual(self.fixture['C'], 12)
        self.assertEqual(self.fixture['H'], 20)
        self.assertEqual(self.fixture['O'], 1)
        self.assertEqual(self.fixture['N'], 4)
        self.assertEqual(self.fixture['S'], 1)

    def test_erase_single(self):
        self.fixture.erase(Molecule(H=1))
        self.assertEqual(self.fixture, Molecule(C=12, H=19, O=1, N=4, S=1))

    def test_erase_self(self):
        self.fixture.erase(self.fixture)
        self.assertEqual(self.fixture, Molecule())
.E.
======================================================================
ERROR: test_erase_self (__main__.TestThiamine)
----------------------------------------------------------------------
Traceback (most recent call last):
File "setup.py", line 49, in test_erase_self
self.fixture.erase(self.fixture)
File "setup.py", line 21, in erase
for k in other.atoms:
RuntimeError: dictionary changed size during iteration
----------------------------------------------------------------------
Ran 3 tests in 0.000s
FAILED (errors=1)
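The RuntimeError above arises because erase loops over other.atoms while modifying self.atoms, and in test_erase_self those are the same dictionary. The Molecule class itself is not shown in these notes, so the version below is a hypothetical stand-in; it sketches one possible fix, which is to iterate over a snapshot of the items.

```python
class Molecule(object):
    '''Minimal stand-in for the Molecule class under test (an assumption).'''

    def __init__(self, **atoms):
        self.atoms = dict(atoms)

    def __getitem__(self, symbol):
        # Atoms that are not present count as zero.
        return self.atoms.get(symbol, 0)

    def __eq__(self, other):
        return self.atoms == other.atoms

    def erase(self, other):
        '''Remove the atoms in 'other' from this molecule.'''
        # Iterate over a snapshot, since 'other' may be 'self'.
        for (symbol, count) in list(other.atoms.items()):
            remaining = self[symbol] - count
            if remaining > 0:
                self.atoms[symbol] = remaining
            else:
                self.atoms.pop(symbol, None)
```

Copying the items costs a little memory, but it means the loop is unaffected by changes made to the dictionary while it runs.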
- Use TestCase.assertRaises to check that a specific function raises a specific exception
- Or use try/except yourself
- Example: in_range raises ValueError if the range is empty, or if the set of values is empty

class TestInRange(unittest.TestCase):

    def test_no_values(self):
        try:
            in_range([], 0.0, 1.0)
        except ValueError:
            pass
        else:
            self.fail()

    def test_bad_range(self):
        try:
            in_range([0.0], 4.0, -2.0)
        except ValueError:
            pass
        else:
            self.fail()

- The StringIO and cStringIO modules can read and write strings instead of files
  - Tests can pass input as StringIO wrappers around strings, and capture output in another StringIO

class TestDiff(unittest.TestCase):

    def wrap_and_run(self, left, right, expected):
        left = StringIO(left)
        right = StringIO(right)
        actual = StringIO()
        diff(left, right, actual)
        self.assertEqual(actual.getvalue(), expected)

    def test_empty(self):
        self.wrap_and_run('', '', '')

    def test_lengthy_match(self):
        text = '''\
a
b
c
'''
        self.wrap_and_run(text, text, '')

    def test_single_line_mismatch(self):
        self.wrap_and_run('a\n', 'b\n', '1\n')

    def test_middle_mismatch(self):
        self.wrap_and_run('a\nb\nc\n', 'a\nx\nc\n', '2\n')

- … "J" appears three times in a string)
- x = find_all(structure)[0] is almost always wrong
- … Rect is correct
- … overlap, and see if the output is correct

![[Rectangle Overlap Test Cases]](./img/unit/rectangle_overlap.png)
Figure 16.1: Rectangle Overlap Test Cases
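The hand-written try/except tests for in_range shown earlier in this lecture can be expressed more compactly with assertRaises. Since in_range itself is not shown in these notes, the version below is a hypothetical stand-in written only so that the example runs; run_tests is likewise a helper name chosen for this sketch.

```python
import unittest


def in_range(values, low, high):
    '''Hypothetical stand-in: return the values lying in [low, high].'''
    if not values:
        raise ValueError('no values')
    if low > high:
        raise ValueError('empty range')
    return [v for v in values if low <= v <= high]


class TestInRange(unittest.TestCase):

    def test_no_values(self):
        # Passes only if in_range([], 0.0, 1.0) raises ValueError.
        self.assertRaises(ValueError, in_range, [], 0.0, 1.0)

    def test_bad_range(self):
        self.assertRaises(ValueError, in_range, [0.0], 4.0, -2.0)

    def test_normal(self):
        self.assertEqual(in_range([0.5, 2.0], 0.0, 1.0), [0.5])


def run_tests():
    '''Run the tests without exiting the interpreter.'''
    suite = unittest.TestLoader().loadTestsFromTestCase(TestInRange)
    return unittest.TextTestRunner().run(suite)
```

Note that the function and its arguments are passed to assertRaises separately; writing in_range([], 0.0, 1.0) inside the call would raise the exception before assertRaises had a chance to catch it.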
Exercise 16.1:
Python has another unit testing module called doctest.
It searches files for sections of text that look like interactive
Python sessions, then re-executes those sections and checks the
results. A typical use is shown below.
def ave(values):
    '''Calculate an average value, or 0.0 if 'values' is empty.
    >>> ave([])
    0.0
    >>> ave([3])
    3.0
    >>> ave([15, -1.0])
    7.0
    '''
    sum = 0.0
    for v in values:
        sum += v
    return sum / float(max(1, len(values)))

if __name__ == '__main__':
    import doctest
    doctest.testmod()
Convert a handful of the tests you have written for other
questions in this lecture to use doctest. Do you prefer it
to unittest? Why or why not? Do you think doctest
makes it easier to test small problems? Large ones? Would it be
possible to write something similar for C, Java, Fortran, or
Mathematica?
- Regular expressions are patterns for matching text, like the "*" in the shell's *.txt
- … «*» and «+»
- … in operator

import re

dragons = [
    ['CTAGGTGTACTGATG', 'Antipodean Opaleye'],
    ['AAGATGCGTCCGTAT', 'Common Welsh Green'],
    ['AGTCGTGCTCGTTATATC', 'Hebridean Black'],
    ['ATGCGTCGTCGATTATCT', 'Hungarian Horntail'],
    ['CCGTTAGGGCTAAATGCT', 'Norwegian Ridgeback']
]
for (dna, name) in dragons:
    if re.search('ATGCGT', dna):
        print name

Common Welsh Green
Hungarian Horntail
import re

dragons = [
    ['CTAGGTGTACTGATG', 'Antipodean Opaleye'],
    ['AAGATGCGTCCGTAT', 'Common Welsh Green'],
    ['AGTCGTGCTCGTTATATC', 'Hebridean Black'],
    ['ATGCGTCGTCGATTATCT', 'Hungarian Horntail'],
    ['CCGTTAGGGCTAAATGCT', 'Norwegian Ridgeback']
]
for (dna, name) in dragons:
    if re.search('ATGCGT|GCT', dna):
        print name

Common Welsh Green
Hebridean Black
Hungarian Horntail
Norwegian Ridgeback
- «|» means “or”
  - The RE above matches either "ATGCGT" or "GCT"
- What if we want to match "ATA" or "ATC" (both of which code for isoleucine)?
  - «ATA|C» will not work: it matches either "ATA" or "C"
  - «ATA|ATC» will work, but it's a bit redundant
  - «AT(A|C)» uses parentheses to limit the scope of «|»

import re

tests = [
    ['ATA', True],
    ['xATCx', True],
    ['ATG', False],
    ['AT', False],
    ['ATAC', True]
]
for (dna, expected) in tests:
    actual = re.search('AT(A|C)', dna) is not None
    assert actual == expected

- The asserts will crash the program if any of the tests fail
- What if we want to match a literal "|", "(", or ")"?
  - Use «\|», «\(», or «\)» in the RE
  - Use «\\» to match a backslash
  - In a Python string, these must be written "\\|", "\\(", "\\)", or "\\\\"

![[Double Compilation of Regular Expressions]](./img/re/double_compilation.png)
Figure 17.1: Double Compilation of Regular Expressions
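The double escaping described above can be checked directly. The sketch below compares the doubled-backslash spelling with Python's raw string notation (the r prefix), and shows what happens if the bar is not escaped at all; the variable names are chosen only for this illustration.

```python
import re

escaped = '\\|'     # ordinary string: Python turns "\\" into one backslash
raw = r'\|'         # raw string: Python leaves the backslash alone

same = (escaped == raw)      # both spell the same two-character pattern
length = len(raw)            # a backslash and a bar

# Either spelling matches a literal vertical bar.
found = re.search(raw, 'ATA|ATC') is not None
missing = re.search(raw, 'ATAATC') is None

# Without the backslash, "|" is the "or" operator.  Both of its
# alternatives are empty here, so the pattern matches any string at all.
unescaped = re.search('|', 'ATAATC') is not None
```

The last case is the dangerous one: the unescaped pattern does not fail, it quietly matches the empty string.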
- Raw strings are written r'abc' or r"this\nand\nthat"
  - r'\n' is a string containing the two characters "\" and "n", not a newline
- In the shell, "*" matches zero or more characters
- In an RE, «*» is an operator that means, “match zero or more occurrences of a pattern”
- Example: find sequences in which "TTA" and "CTA" are separated by any number of "G"

import re

tests = [
    ['TTACTA', True],      # separated by zero G's
    ['TTAGCTA', True],     # separated by one G
    ['TTAGGGCTA', True],   # separated by three G's
    ['TTAXCTA', False],    # an X in the way
    ['TTAGCGCTA', False],  # an embedded C in the way
]
for (dna, expected) in tests:
    actual = re.search('TTAG*CTA', dna) is not None
    assert actual == expected

- The RE matches "TTACTA" because «G*» can match zero occurrences of "G"

![[Zero or More]](./img/re/star_match.png)
Figure 17.2: Zero or More
- «+» matches one or more (i.e., won't match the empty string)

assert re.search('TTAG*CTA', 'TTACTA')
assert not re.search('TTAG+CTA', 'TTACTA')

![[One or More]](./img/re/plus_match.png)
Figure 17.3: One or More
- The «?» operator means “optional”

assert re.search('AC?T', 'AT')
assert re.search('AC?T', 'ACT')
assert not re.search('AC?T', 'ACCT')

![[Zero or One]](./img/re/option_match.png)
Figure 17.4: Zero or One
- Use «[]» to match sets of characters
  - «[abcd]» matches exactly one "a", "b", "c", or "d"
  - A range can be abbreviated as «[a-d]»
- Sets can be followed by «*», «+», or «?»
  - «[aeiou]+» matches any non-empty sequence of vowels

import re

lines = [
    "Charles Darwin (1809-82)",
    "Darwin's principal works, The Origin of Species (1859)",
    "and The Descent of Man (1871) marked a new epoch in our",
    "understanding of our world and ourselves. His ideas",
    "were shaped by the Beagle's voyage around the world in",
    "1831-36."
]
for line in lines:
    if re.search('[0-9]+', line):
        print line

Charles Darwin (1809-82)
Darwin's principal works, The Origin of Species (1859)
and The Descent of Man (1871) marked a new epoch in our
1831-36.
| Sequence | Equivalent | Explanation |
|---|---|---|
| «\d» | «[0-9]» | Digits |
| «\s» | «[ \t\r\n]» | Whitespace |
| «\w» | «[a-zA-Z0-9_]» | Word characters (i.e., those allowed in variable names) |
| Table 17.1: Regular Expression Escapes in Python | ||

- «[^abc]» means “anything except the characters in this set”
- «.» means “any character except the end of line”
  - Equivalent to «[^\n]»
- «\b» matches the break between word and non-word characters

![[Word/Non-Word Breaks]](./img/re/word_nonword_break.png)
Figure 17.5: Word/Non-Word Breaks
- Use string.split to break on spaces and newlines before applying the RE

import re

words = '''Born in New York City in 1918, Richard Feynman earned a
bachelor's degree at MIT in 1939, and a doctorate from Princeton in
1942. After working on the Manhattan Project in Los Alamos during
World War II, he became a professor at CalTech in 1951. Feynman won
the 1965 Nobel Prize in Physics for his work on quantum
electrodynamics, and served on the commission investigating the
Challenger disaster in 1986.'''.split()

end_in_vowel = set()
for w in words:
    if re.search(r'[aeiou]\b', w):
        end_in_vowel.add(w)
for w in end_in_vowel:
    print w

a
Prize
degree
became
doctorate
the
he
- re.search(r'\s*', line) will match "start end", because «\s*» can match zero characters anywhere in the string
- «^» matches the beginning of the string
- «$» matches the end

![[Anchoring Matches]](./img/re/match_anchor.png)
Figure 17.6: Anchoring Matches
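The effect of the anchors «^» and «$» can be checked with a few small cases. The list below is a sketch; the patterns and strings were chosen to show one anchored and one unanchored version of each idea.

```python
import re

# Each entry is (pattern, text, whether a match is expected).
cases = [
    ('b+',   'abbc',  True),   # unanchored: finds "bb" in the middle
    ('^b+',  'abbc',  False),  # string does not start with "b"
    ('c$',   'abbc',  True),   # string ends with "c"
    ('^a*$', 'aabaa', False),  # something other than "a" in the string
]

results = [re.search(pattern, text) is not None
           for (pattern, text, expected) in cases]

# Without anchors, "a*" matches the leading "aa" of "aabaa".
unanchored = re.search('a*', 'aabaa').group()
```

Anchoring both ends, as in «^a*$», is the usual way to say "the whole string must look like this".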
| Pattern | Text | Result |
|---|---|---|
| «b+» | "abbc" | Matches |
| «^b+» | "abbc" | Fails (string doesn't start with b) |
| «c$» | "abbc" | Matches (string ends with c) |
| «^a*$» | "aabaa" | Fails (something other than "a" between start and end of string) |
| Table 17.2: Regular Expression Anchors in Python | ||

- In the data below, a comment starts with "#", and extends to the end of the line
- Simple approach: find the lines that contain "#", and split them there

import sys, re

lines = '''Date: 2006-03-07
On duty: HP # 01:30 - 03:00
Observed: Common Welsh Green
On duty: RW #03:00-04:30
Observed: none
On duty: HG # 04:30-06:00
Observed: Hebridean Black
'''.split('\n')

for line in lines:
    if re.search('#', line):
        comment = line.split('#')[1]
        print comment

 01:30 - 03:00
03:00-04:30
 04:30-06:00
- split followed by strip seems clumsy
- The result of re.search is actually a match object that records what matched, and where
  - mo.group() returns the whole string that matched the RE
  - mo.start() and mo.end() are the indices of the match's location

import re

text = 'abbcb'
for pattern in ['b+', 'bc*', 'b+c+']:
    match = re.search(pattern, text)
    print '%s / %s => "%s" (%d, %d)' % \
          (pattern, text, match.group(), match.start(), match.end())

b+ / abbcb => "bb" (1, 3)
bc* / abbcb => "b" (1, 2)
b+c+ / abbcb => "bbc" (1, 4)
- mo.group(3) is the text that matched the third parenthesized subexpression, and mo.start(3) is where it started

import sys, re

lines = '''Date: 2006-03-07
On duty: HP # 01:30 - 03:00
Observed: Common Welsh Green
On duty: RW #03:00-04:30
Observed: none
On duty: HG # 04:30-06:00
Observed: Hebridean Black
'''.split('\n')

for line in lines:
    match = re.search(r'#\s*(.+)', line)
    if match:
        comment = match.group(1)
        print comment

01:30 - 03:00
03:00-04:30
04:30-06:00
import re

def reverse_columns(line):
    match = re.search(r'^\s*(\d+)\s+(\d+)\s*$', line)
    if not match:
        return line
    return match.group(2) + ' ' + match.group(1)

tests = [
    ['10 20', 'easy case'],
    [' 30 40 ', 'padding'],
    ['60 70 80', 'too many columns'],
    ['90 end', 'non-numeric']
]
for (fixture, title) in tests:
    actual = reverse_columns(fixture)
    print '%s: "%s" => "%s"' % (title, fixture, actual)

easy case: "10 20" => "20 10"
padding: " 30 40 " => "40 30"
too many columns: "60 70 80" => "60 70 80"
non-numeric: "90 end" => "90 end"
![[Regular Expressions as Finite State Machines]](./img/re/re_fsm.png)
Figure 17.7: Regular Expressions as Finite State Machines
- Use re.compile(pattern) to get the compiled RE
- The compiled object has the same methods as the re module
  - matcher.search(text) searches text for matches to the RE that was compiled to create matcher

import re

# Put pattern outside 'find_all' so that it's only compiled once.
pattern = re.compile(r'\b([A-Z][a-z]*)\b(.*)')

def find_all(line):
    result = []
    match = pattern.search(line)
    while match:
        result.append(match.group(1))
        match = pattern.search(match.group(2))
    return result

lines = [
    'This has several Title Case words',
    'on Each Line (Some in parentheses).'
]
for line in lines:
    print line
    for word in find_all(line):
        print '\t', word

This has several Title Case words
	This
	Title
	Case
on Each Line (Some in parentheses).
	Each
	Line
	Some
- Or use the findall method

import re

lines = [
    'This has several Title Case words',
    'on Each Line (Some in parentheses).'
]
pattern = re.compile(r'\b([A-Z][a-z]*)\b')
for line in lines:
    print line
    for word in pattern.findall(line):
        print '\t', word

This has several Title Case words
	This
	Title
	Case
on Each Line (Some in parentheses).
	Each
	Line
	Some
| Pattern | Matches | Doesn't Match | Explanation |
|---|---|---|---|
| «a*» | "", "a", "aa", … | "A", "b" | «*» means “zero or more”; matching is case sensitive |
| «b+» | "b", "bb", … | "" | «+» means “one or more” |
| «ab?c» | "ac", "abc" | "a", "abbc" | «?» means “optional” (zero or one) |
| «[abc]» | "a", "b", or "c" | "ab", "d" | «[…]» means “one character from a set” |
| «[a-c]» | "a", "b", or "c" | | Character ranges can be abbreviated |
| «[abc]*» | "", "ac", "baabcab", … | | Operators can be combined: zero or more choices from "a", "b", or "c" |
| Table 17.3: Regular Expression Operators | |||
| Method | Purpose | Example | Result |
|---|---|---|---|
| split | Split a string on a pattern. | re.split('\\s*,\\s*', 'a, b ,c , d') | ['a', 'b', 'c', 'd'] |
| findall | Find all matches for a pattern. | re.findall('\\b[A-Z][a-z]*', 'Some words in Title Case.') | ['Some', 'Title', 'Case'] |
| sub | Replace matches with new text. | re.sub('\\d+', 'NUM', 'If 123 is 456') | "If NUM is NUM" |
| Table 17.4: Regular Expression Object Methods | |||
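The three calls in the table can be run as written. The sketch below repeats them using raw strings, which avoid the doubled backslashes, and then shows the same substitution through a compiled pattern; the variable names are chosen only for this illustration.

```python
import re

# Module-level functions, with the patterns written as raw strings.
parts = re.split(r'\s*,\s*', 'a, b ,c , d')
titles = re.findall(r'\b[A-Z][a-z]*', 'Some words in Title Case.')
replaced = re.sub(r'\d+', 'NUM', 'If 123 is 456')

# The same substitution using a compiled pattern object.
digits = re.compile(r'\d+')
replaced_again = digits.sub('NUM', 'If 123 is 456')
```

Compiling first is worthwhile when the same pattern is applied many times, as in the find_all example earlier.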
- Use «pat{N}» to match exactly N occurrences of a pattern
- «pat{M,N}» matches between M and N occurrences

Exercise 17.1:
By default, regular expression matches are
greedy: the first term in the RE
matches as much as it can, then the second part, and so on. As a
result, if you apply the RE «X(.*)X(.*)» to the string
"XaX and XbX", the first group will contain "aX and Xb",
and the second group will be empty.
It's also possible to make REs match
reluctantly, i.e., to have the
parts match as little as possible, rather than as much. Find out
how to do this, and then modify the RE in the previous paragraph
so that the first group winds up containing "a", and the
second group " and XbX".
Exercise 17.2:
What is the easiest way to write a case-insensitive regular expression? (Hint: read the documentation on compilation options.)
Exercise 17.3:
What does the VERBOSE option do when compiling a regular
expression? Use it to rewrite some of the REs in this lecture in
a more readable way.
Exercise 17.4:
What does the DOTALL option do when compiling a regular
expression? Use it to get rid of the call to
string.split in the example that finds words ending in
vowels.
- The string "10239472" is 8 bytes long, but the 32-bit integer it represents is 4 bytes
- … "34" to the one represented by "57"

![[Two's Complement]](./img/binary/twos_complement.png)
Figure 18.1: Two's Complement
| Name | Symbol | Purpose | Example |
|---|---|---|---|
| And | & | 1 if both bits are 1, 0 otherwise | 0110 & 1010 = 0010 |
| Or | \| | 1 if either bit is 1 | 0110 \| 1010 = 1110 |
| Xor | ^ | 1 if the bits are different, 0 if they're the same | 0110 ^ 1010 = 1100 |
| Not | ~ | Flip each bit | ~0110 = 1001 |
| Table 18.1: Bitwise Operators in Python | |||
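The examples in the table can be reproduced directly. One wrinkle is that Python integers are not limited to four bits, so ~ produces a negative number; the hypothetical helper bits below masks the result down to a fixed width before displaying it.

```python
def bits(value, width=4):
    '''Show the low 'width' bits of value as a string of 0s and 1s.'''
    mask = (1 << width) - 1
    return format(value & mask, '0%db' % width)


a = int('0110', 2)
b = int('1010', 2)

and_result = bits(a & b)   # 1 only where both bits are 1
or_result = bits(a | b)    # 1 where either bit is 1
xor_result = bits(a ^ b)   # 1 where the bits differ
not_result = bits(~a)      # every bit flipped, then cut back to 4 bits
```

Without the mask, ~a is simply -7, which is the two's complement reading of the flipped bits.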
def format_bits(val, width=1):
    '''Create a base-2 representation of an integer.'''
    result = ''
    while val:
        if val & 0x01:
            result = '1' + result
        else:
            result = '0' + result
        val = val >> 1
    if len(result) < width:
        result = '0' * (width - len(result)) + result
    return result

tests = [
    [ 0, None, '0'],
    [ 0, 4,    '0000'],
    [ 5, None, '101'],
    [19, 8,    '00010011']
]
for (num, width, expected) in tests:
    if width is None:
        actual = format_bits(num)
    else:
        actual = format_bits(num, width)
    print '%4d %8s %10s %10s' % (num, width, expected, actual)

   0     None          0          0
   0        4       0000       0000
   5     None        101        101
  19        8   00010011   00010011
- x << N shifts the bits of x left by N places
- Use and, or, and not to set specific bits to 1 or 0
- To set bit i of x to 1:
  - Create a mask in which bit i is 1 and all others are 0
  - x = x | mask
- To set bit i of x to 0:
  - Create a mask in which bit i is 1 and all others are 0
  - Negate it with ~, so that the ith bit is 0, and all the others are 1
  - x = x & mask

![[Setting and Clearing Bits]](./img/binary/setting_clearing_bits.png)
Figure 18.2: Setting and Clearing Bits
![[Using Bits to Record Sets of Flags]](./img/binary/bit_flags.png)
Figure 18.3: Using Bits to Record Sets of Flags
#                   hex     binary
MERCURY    = 0x01   # 0001
PHOSPHORUS = 0x02   # 0010
CHLORINE   = 0x04   # 0100

# Sample contains mercury and chlorine
sample = MERCURY | CHLORINE
print 'sample: %04x' % sample

# Check for various elements
for (flag, name) in [[MERCURY, "mercury"],
                     [PHOSPHORUS, "phosphorus"],
                     [CHLORINE, "chlorine"]]:
    if sample & flag:
        print 'sample contains', name
    else:
        print 'sample does not contain', name

sample: 0005
sample contains mercury
sample does not contain phosphorus
sample contains chlorine
![[Uneven Spacing of Floating-Point Numbers]](./img/binary/uneven_spacing.png)
Figure 18.5: Uneven Spacing of Floating-Point Numbers
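- f.read(N) reads (up to) the next N bytes
  - If nothing is left to read from f, it returns an empty string
- f.write(str) writes the bytes in the string str
- Open binary files with input = open(filename, 'rb') (and similarly for output)
  - In text mode, Windows-style "\r\n" may be converted to Unix-style "\n"…
- Example: read the same file opened first in "r", then in "rb"

import sys
print sys.platform
for mode in ('r', 'rb'):
    f = open('open_binary.py', mode)
    s = f.read(40)
    f.close()
    print repr(s)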
cygwin 'import sys\r\nprint sys.platform\r\nfor mode'
linux 'import sys\nprint sys.platform\nfor mode in '
- In C, fwrite(&array, sizeof(int), 3, file) will write three 4-byte integers to a file

![[C Storage vs. Python Storage]](./img/binary/c_vs_python_storage.png)
Figure 18.6: C Storage vs. Python Storage
![[Packing Data]](./img/binary/pack_data.png)
Figure 18.7: Packing Data
- Use the struct module to pack and unpack
  - pack(fmt, v1, v2, …) packs the values v1, v2, etc. according to fmt, returning a string
  - unpack(fmt, str) unpacks the values in str according to fmt, returning a tuple

import struct

fmt = 'hh'  # two 16-bit integers
x = 31
y = 65
binary = struct.pack(fmt, x, y)
print 'binary representation:', repr(binary)
normal = struct.unpack(fmt, binary)
print 'back to normal:', normal

binary representation: '\x1f\x00A\x00'
back to normal: (31, 65)

- What is '\x1f\x00A\x00'?
  - It is the four bytes ['\x1f', '\x00', 'A', '\x00']
  - The ASCII code for "A" is 65 (base 10)

| Format | Meaning |
|---|---|
| "c" | Single character (i.e., string of length 1) |
| "B" | Unsigned 8-bit integer |
| "h" | Short (16-bit) integer |
| "i" | 32-bit integer |
| "f" | 32-bit float |
| "d" | Double-precision (64-bit) float |
| "s" | String of fixed size (see below) |
| Table 18.2: Packing Format Specifiers | |

- Any format can be preceded by a count: "4i" is four integers
- Small values can be stored using "B" or "h" instead of the full 32 bits
- Use "4s" for a 4-character string
- How does unpack know how much data to use?
  - calcsize(fmt) calculates how large (in bytes) the data produced using fmt will be

import struct

packed = struct.pack('4c', 'a', 'b', 'c', 'd')
print 'packed string:', repr(packed)

left16, right16 = struct.unpack('hh', packed)
print 'as two 16-bit integers:', left16, right16

all32 = struct.unpack('i', packed)
print 'as a single 32-bit integer', all32[0]

float32 = struct.unpack('f', packed)
print 'as a 32-bit float', float32[0]

packed string: 'abcd'
as two 16-bit integers: 25185 25699
as a single 32-bit integer 1684234849
as a 32-bit float 1.67779994081e+22
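The results above depend on the byte order of the machine that produced them. As a sketch, the code below repeats the experiments with the "<" prefix, which asks struct for little-endian layout with standard sizes; that prefix is an assumption added here so the numbers come out the same everywhere. The helper safe_unpack is a hypothetical name used to show what calcsize is for.

```python
import struct

# Two 16-bit integers, least significant byte first.
binary = struct.pack('<hh', 31, 65)
byte_values = list(bytearray(binary))

# The same values with the most significant byte first.
big_endian = list(bytearray(struct.pack('>hh', 31, 65)))

# calcsize reports how many bytes each format needs.
sizes = [struct.calcsize(f) for f in ('<4c', '<hh', '<i', '<f', '<d')]

# Reinterpret the same four bytes in different ways.
packed = struct.pack('<4c', b'a', b'b', b'c', b'd')
left16, right16 = struct.unpack('<hh', packed)
(all32,) = struct.unpack('<i', packed)


def safe_unpack(fmt, data):
    '''Unpack only if the buffer is exactly the size the format needs.'''
    if struct.calcsize(fmt) != len(data):
        return None
    return struct.unpack(fmt, data)
```

The bytes themselves carry no type information: "abcd" is four characters, two short integers, or one 32-bit integer depending entirely on the format used to read it back.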
![[Packing a Variable-Length Vector]](./img/binary/pack_vec.png)
Figure 18.8: Packing a Variable-Length Vector
import struct

def pack_vec(vec):
    buf = struct.pack('i', len(vec))
    for v in vec:
        buf += struct.pack('i', v)
    return buf

def unpack_vec(buf):
    # Get the count of the number of elements in the vector.
    int_size = struct.calcsize('i')
    count = struct.unpack('i', buf[0:int_size])[0]
    # Get 'count' values, one by one.
    pos = int_size
    result = []
    for i in range(count):
        v = struct.unpack('i', buf[pos:pos+int_size])
        result.append(v[0])
        pos += int_size
    return result

def pack_strings(strings):
    result = ''
    for s in strings:
        length = len(s)
        format = 'i%ds' % length
        result += struct.pack(format, length, s)
    return result

def unpack_strings(buf):
    int_size = struct.calcsize('i')
    pos = 0
    result = []
    while pos < len(buf):
        length = struct.unpack('i', buf[pos:pos+int_size])[0]
        pos += int_size
        format = '%ds' % length
        s = struct.unpack(format, buf[pos:pos+length])[0]
        pos += length
        result.append(s)
    return result
def unpack_strings(buf):
int_size = struct.calcsize('i')
pos = 0
resu